Rhino forms nuclear foci and binds the genome in OSCs

(A) Confocal immunofluorescent images of FLAG-Rhino (red) in FLAG-Rhino inducible OSCs upon Dox induction (5 ng/mL). H3K9me3 and OSC nucleus (DAPI staining) are shown in green and blue. Scale bar represents 10 µm. (B) Genome snapshot showing ChIP-seq in major dual-strand piRNA clusters (42AB and 38C). The gray-shaded areas indicate the sites at which FLAG-Rhino is localized in each piRNA cluster in OSCs. piRNA cluster 38C has two piRNA-producing regions, labeled 38C1 and 38C2. (C) Heatmap showing FLAG-Rhino localization of OSCs in Rhino peaks of Drosophila wild-type Oregon-R ovary. The regions of 6 kb around the Rhino localization site were plotted based on macs2 callpeak results. (D) Metaplot showing the localization of FLAG-Rhino in OSCs at piRNA clusters. For piRNA clusters, broad peaks of Rhino localization in the ovary greater than 1.5 kb were used. The overlap of each Rhino peak with annotated piRNA clusters (e.g., 42AB, 38C, and 80F) was identified, as listed in Supplementary Table 9. (E) Venn diagram showing the overlap of Rhino, H3K9me3, and H3K9me2 peaks in OSCs. (F) Heatmap showing the signals of ChIP-seq using anti-FLAG, anti-H3K9me3, and anti-H3K9me2 in FLAG-Rhino peaks of OSCs. The regions of 2 kb around the FLAG-Rhino localization sites were plotted based on macs2 callpeak results. (G) Heatmap showing the signals of ChIP-seq using anti-H3K9me3, anti-FLAG, anti-HP1a antibodies at H3K9me3 broad peaks of OSCs. The regions of 10 kb around the FLAG-Rhino localization sites were plotted based on macs2 callpeak results.

ADMA-histones tend to co-localize with FLAG-Rhino in OSCs

(A) Heatmap showing key histone modifications and HP1a ChIP-seq results at FLAG-Rhino localization sites (828 regions) in OSCs. The regions of 2 kb around the FLAG-Rhino localization sites were plotted based on macs2 callpeak results. (B) Genome snapshot showing all histone modifications’ ChIP-seq data in major dual-strand piRNA clusters (38C). The gray-shaded areas indicate the localization sites of FLAG-Rhino in each piRNA cluster in OSCs. (C) Metaplot showing the localization of ADMA-histones and FLAG-Rhino in OSCs at piRNA clusters. For piRNA clusters, broad peaks of Rhino localization in the ovary greater than 1.5 kb were used. (D) Pearson correlation coefficient comparing the coverage per 100 bp bin size of all ChIP-seq mapping results to the dm6 genome and its hierarchical clustering. (E) Venn diagram showing sites of overlapping localization of FLAG-Rhino, H3K9me3, H3R2me2a, H3R8me2a, H3R17me2a, H3R26me2a, H4R3me2a, and all ADMA-histones in OSCs.

Arginine methyltransferases DART1 and DART4 play a role in Rhino genomic localization and foci formation in OSCs

(A) Confocal immunofluorescent images of FLAG-Rhino (red) and nuclei (blue; DAPI stained) in the EGFP-, HP1a-, DART1-, DART4-, DART8-, and CG17726-KD OSCs. Scale bar represents 10 µm. (B) Comparison of FLAG-Rhino foci sizes among the EGFP-, HP1a-, DART1-, DART4-, DART8-, and CG17726-KD OSCs in (A). Thirty OSCs from three biological replicates were analyzed. Statistical significance in differences of Rhino foci between the genotypes was examined using Kolmogorov-Smirnov test. *P < 0.05, **P < 0.01, and ***P < 0.001. HP1a KD: ***P = 8.49 × 10−13, DART1 KD: ***P = 1.83 × 10−7, DART4 KD: ***P = 2.83 × 10−4, DART8 KD: n.s. P = 0.655, CG17726 KD: n.s. P = 0.454. The box indicates the lower (25%) and upper (75%) quartiles, and the open circle indicates the median. Whiskers extend to the most extreme data points. (C) ChIP-seq profile showing the localization of FLAG-Rhino, H3R2me2a, H3R8me2a, H3R17me2a, H3R26me2a, and H4R3me2a in EGFP-, DART1-, and DART4-KD OSCs at FLAG-Rhino localization sites. The genomic regions used to draw the profile are the same as in Fig. 1G. (D) Heatmap showing the localization of FLAG-Rhino, H3R17me2a, and H3R8me2a in EGFP– and DART4-KD OSCs at the sites of decreased FLAG-Rhino localization in DART4 KD. The regions with decreased FLAG-Rhino were defined as those with an M value greater than 0.5 by manorm. macs2 callpeak p = 0.05 was used for manorm with input as a control. (E) FLAG-Rhino ChIP-qPCR results of one site at which Rhino initially localized. DART1 KD: †P = 5.88 × 10−2, DART4 KD: * P = 3.17 × 10−2. (F) Venn diagram showing the overlap of H3R17me2a localization decreased in DART4 KD and H4R3me2a localization decreased in DART1 KD. The regions with decreased ADMA-histones were defined as those with an M value greater than 1 in manorm. macs2 callpeak p = 0.01 was used for manorm. (G) Venn diagram showing the overlap of Rhino localizations that were decreased by DART1 KD and DART4 KD. The decreased genomic sites are identified as in (F).

ADMA-histones co-localize with Rhino prior to Rhino spread in the ovary

(A) Genome browser snapshot showing the localization of H3R2me2a, H3R8me2a, H3R17me2a, H3R26me2a, and H4R3me2a in piRNA clusters in OSCs and Oregon-R ovary. The gray-shaded areas indicate the localization sites of FLAG-Rhino in each piRNA cluster in OSCs. (B) Heatmap showing the localization of H3R2me2a, H3R8me2a, H3R17me2a, H3R26me2a, and H4R3me2a at Rhino peaks in Oregon-R ovary. The Rhino peak regions shown in Fig. 1C were used. (C) Metaplot showing the localization of ADMA-histones in the ovary at piRNA clusters. For piRNA clusters, broad peaks of Rhino localization in the ovary greater than 1.5 kb were used.

DART4 GLKD has impacts on Rhino foci and Rhino genomic localization in the ovary

(A) Confocal immunostaining images of Rhino (red) in ctrl– and DART4-GLKD ovaries. DAPI shows nuclei (blue). Scale bar represents 10 µm. (B) Comparison of sizes of Rhino foci between the ctrl– and DART4-GLKD ovaries in (A). Ten ovaries from two biological replicates were analyzed. Statistical significance in differences of Rhino foci between the two genotypes was examined as in Fig. 3B. DART4 GLKD: ***P = 1.26 × 10−10. (C) ChIP-seq profile showing the localization of Rhino and H3R17me2a in ctrl– and DART4-GLKD ovaries at the FLAG-Rhino localization sites in OSCs. The FLAG-Rhino peak regions, as in Fig. 1G, are used. (D) Rhino ChIP-qPCR results of one site at which Rhino initially localized and piRNA cluster 42AB. t-test was used for statistical hypothesis testing. Rhino initial localization point: * P = 4.03 × 10−2, piRNA cluster 42AB: n.s. P = 0.11. (E) Genome snapshot of ChIP-seq data in euchromatic regions. The gray-shaded areas indicate genomic locations where Rhino localization is decreased in a H3R17me2a-dependent manner. (F) Scatter plot comparing Rhino enrichment in ChIP-seq between ctrl GLKD and DART4 GLKD. Each dot represents the square root of the read count per million mapped reads (RPM) within the annotated piRNA clusters in piRNAcluster database42 (colored in black, 184 clusters, without clusters on chromosome Y and uni-strand piRNA clusters flamenco/20A) and FLAG-Rhino loaded points in OSCs as in Fig. 5C (colored in red, 828 regions). The dotted lines represent the least-squares linear regression fit to the data points: the black line for 184 piRNA clusters and the red line for 828 FLAG-Rhino–loaded points in OSCs.

DART4-dependent piRNA source loci (DART4 piSL) are genomic loci where Rhino spreads from H3R17me2a

(A) Venn diagram showing the overlap between sites with reduced Rhino localization and sites where H3R17me2a was lost in DART4 GLKD. The regions where Rhino was decreased were defined as those with an M value greater than 0.5 by manorm, and the regions where H3R17me2a was lost were defined as those with an M value greater than 0. For manorm, the output results of macs2 callpeak p = 0.01 were used with input as a control. (B) Scatter plot comparing Rhino enrichment in ChIP-seq between ctrl GLKD and DART4 GLKD. Each dot represents the square root of the read count per million mapped reads (RPM) within the annotated piRNA clusters in piRNAcluster database42 (colored in black, 184 clusters, without clusters on chromosome Y and uni-strand piRNA clusters flamenco/20A) and Rhino reduced regions in DART4 GLKD (colored in green, 43 clusters). The dotted lines represent an approximation by a linear regression model. (C) Chromosomal locations of Rhino reduced regions in DART4 GLKD and the estimated direction of Rhino spread based on initial localization in OSCs. Depressed areas indicate pericentromeres. Arrows show the spread direction based on ADMA-histones. Bidirectional arrows indicate that ADMA-histones are present at both ends and the direction of spread cannot be determined. Horizontal bars indicate clusters without ADMA-histones at both ends. (D) Genome snapshot showing Rhino reduced regions in DART4 GLKD around the bx gene. Black lines mark Rhino reduced regions boundaries, and gray-shaded areas highlight potential ADMA-histone localization at cluster ends. (E) ChIP-seq profiles of Rhino and H3R17me2a in ctrl GLKD and DART4 GLKD across all 43 Rhino reduced regions in DART4 GLKD, and Rhino in OSCs. (F) Scatter plot comparing piRNA enrichment between ctrl GLKD and DART4 GLKD. Each dot represents the logged RPM value within piRNA clusters (black: 184 annotated clusters; green: 43 Rhino reduced regions in DART4 GLKD). Dotted lines indicate linear regression. (G) Box plot showing RPKM of piRNAs uniquely mapped to Rhino sites in OSCs, Rhino reduced regions in DART4 GLKD, all piRNAcluster database42-annotated clusters, and highly transcribed clusters (42AB, 38C, 80F, 102F). Clusters with RPKM below 400 are shown.

ADMA-histones are enriched at the ends of evolutionarily young piRNA clusters

(A) RPKM of each ChIP-seq dataset (Oregon-R) in the piRNA cluster dataset with eight levels of evolutionary novelty. RPKM was calculated for the 500 bp region surrounding the ends of the piRNA clusters. t-test was used for statistical hypothesis testing between clusters found only in 1 strain (cluster 1, newly emerged) and clusters found in all 8 strains (cluster 8, evolutionarily conserved). All ADMA-histones: ** P = 1.11 × 10−3, H3R2me2a * P = 1.45 × 10−2, H3R8me2a: *** P = 1.27 × 10−4, H3R17me2a: ** P = 7.28 × 10−3, H3R26me2a: n.s. P = 0.23, H4R3me2a: ** P = 1.91 × 10−3, input: * P = 0.53. (B) Same analysis of Fig. 7A for each ChIP-seq dataset (ctrl GLKD and DART4 GLKD). H3R17me2a (ctrl GLKD): ** P = 2.77 × 10−3, H3R17me2a (DART4 GLKD): n.s. P = 0.24, input (ctrl GLKD): n.s. P = 0.42, input (DART4 GLKD): n.s. P = 0.47. (C) Model for the piRNA source loci formation. In the cell conditions lacking Kipferl such as in OSCs, Rhino initially localizes to bivalent nucleosomes containing H3K9me3 and ADMA-histones (left). In the second step, Rhino spreading occurs, leading to the formation of DART4 piSL. Transcription of adjacent regions and/or H3K9me2 enrichment may play a role in this process (right). Rhino in DART4 piSL is lost if ADMA-histones are lost. It remains unclear whether DART4 piSL can eventually develop into highly transcribed piRNA clusters, such as 42AB, over the course of evolutionary time.

Expression and localization of FLAG-Rhino at different Dox concentrations

(A) Confocal immunofluorescent images of FLAG-Rhino (green) in FLAG-Rhino inducible OSCs upon Dox induction (from 0.01 to 1.0 ng/mL). DAPI shows nuclei (blue). Scale bar represents 10 µm. (B) Western blotting of FLAG-Rhino under each Dox concentration. HP1a was detected as an internal control. (C) Genome snapshot showing ChIP-seq in major dual-strand piRNA clusters (80F and 102F). The gray-shaded areas indicate the sites at which FLAG-Rhino is localized in each piRNA cluster in OSCs. (D) Heatmap showing the localization of FLAG-Rhino in OSCs under various Dox concentrations and Rhino in ovary at FLAG-Rhino peaks of OSCs. The regions of 2 kb around the FLAG-Rhino localization sites were plotted based on macs2 callpeak results. (E) Genome snapshot showing ChIP-seq in major dual-strand piRNA clusters (42AB and 38C). The gray-shaded areas indicate the sites at which FLAG-Rhino is localized in each piRNA cluster in OSCs. piRNA cluster 38C has two piRNA-producing regions, labeled 38C1 and 38C2.

Behavior of ADMA-histones in FLAG-Rhino inducible OSCs

(A) Genome snapshot showing all histone modifications ChIP-seq data in major dual-strand piRNA clusters (42AB, 80F and 102F). The gray-shaded areas indicate the localization sites of FLAG-Rhino in each piRNA cluster in OSCs. (B) Venn diagram showing the sites of overlapping localization of each ADMA-histone. (C) Venn diagram showing the sites of overlapping localization of ADMA-histones, HP1a, and H3K9me3.

Behavior of FLAG-Rhino, HP1a, and ADMA-histones under DART1-KD and DART4-KD conditions

(A) Confocal immunofluorescent images of FLAG-Rhino (red) and H3K9me3 (green) in FLAG-Rhino inducible OSCs. Dox was added at a final concentration of 5 ng/mL. DAPI shows nuclei (blue). Scale bar represents 10 µm. (B) Confocal immunofluorescent images of HP1a (red) and H3K9me3 (green) in FLAG-Rhino inducible OSCs. Dox was added in final concentration of 5 ng/mL. DAPI shows nuclei (blue). Scale bar represents 10 µm. (C) Venn diagram showing the overlap of the decreased H3R8me2a regions in DART1 KD and DART4 KD. The decreased regions were defined as those with an M value greater than 0.5 in manorm. macs2 callpeak p = 0.01 was used for manorm with input as a control. (D) Venn diagram showing the overlap of the regions where H3R8me2a and H3R17me2a decreased in DART4 KD. The decreased regions were identified using the same conditions as in (c). (E) Venn diagram showing the overlap of the regions where H3R8me2a and H4R3me2a were decreased in DART1 KD. The decreased regions were identified using the same conditions as in (C) and Fig. 3F, G. (F) Western blotting using anti-FLAG, anti-βTub anti-H3R17me2a antibodies on total lysates of FLAG-Rhino inducible OSCs. Dox was added at a final concentration of 5 ng/mL. (G) Western blotting using anti-ADMA, anti-HP1a, anti-FLAG antibodies on immunoprecipitated nuclear lysates of FLAG-Rhino inducible OSCs. Dox was added at a final concentration of 5 ng/mL. Anti-H3R17me2a antibody was used as a positive control for the reactivity of the anti-ADMA antibody. (H) Scatter plot comparing gene expression levels between OSCs and ovaries. Each dot represents a single gene plotted by its TPM values (log10). RNA-seq data for OSCs and ovaries were obtained from DRR05098823 and SRR11879526, respectively. (I) Comparison of qRT–PCR results for DART1, DART4, DART8, and CG17726 in OSCs under EGFP knockdown conditions (n = 3). (J) qRT–PCR analysis of DART1, DART4, DART8, and CG17726 in OSCs under each gene knockdown condition (n = 3).

ADMA-histones localize at the ends of Kdm3 GLKD specific ectopic piRNA clusters

Genome browser snapshot showing the localization of H3R2me2a, H3R8me2a, H3R17me2a, H3R26me2a, and H4R3me2a in piRNA clusters in OSCs and Oregon-R ovary. The gray-shaded areas indicate the localization sites of ADMA-histones in Oregon-R ovary.

Behavior of Rhino and ADMA-histones under DART4-GLKD conditions

(A) qRT-PCR for DART4 in ovaries under ctrl-GLKD and DART4-GLKD conditions (n = 3). (B) ChIP-seq heatmap showing the localization of Rhino and H3R17me2a in ctrl-GLKD and DART4-GLKD conditions at the FLAG-Rhino localization sites in OSCs. (C) ChIP-seq heatmap showing the localization of Rhino and H3R17me2a in ctrl-GLKD and DART4-GLKD conditions at Rhino localized regions in WT Oregon-R ovary. (D) Genome snapshot of the indicated ChIP-seq data in piRNA clusters, 42AB, 80F, 38C, and 102F. (E) Genome browser snapshot of the indicated ChIP-seq data in representative ectopic piRNA clusters appearing in Kdm3 GLKD (R2_bi region, R4_jing region and R7_opa/laf region). The gray-shaded areas indicate the localization sites of potential FLAG-Rhino considered to be the ends of the ectopic piRNA cluster in OSCs. (F) Metaplot showing ADMA-histones at the major ectopic piRNA clusters R1-R12 (R5 was excluded because of low Rhino localization in Kdm3 GLKD) that appeared in Kdm3 GLKD. Kdm3 GLKD-dependent ectopic piRNA cluster ends were extracted using the following method. Among Rhino peaks listed in the MACS2 callpeak files “GSE203279_MACS2_narrowpeak_Rhino_Ctrl.txt” and “GSE203279_MACS2_ narrowpeak_Rhino_Kdm3GLKD.txt” in the GSE203279 dataset, we first extracted the Rhino peaks that appeared only in the latter13. The peaks closest to the 5′ and 3′ ends of the R1-R12 regions in Table S4 of the Casier et al. (2023) were designated as the ends of ectopic piRNA clusters13.

piRNA source loci with H3R17me2a at the ends lose Rhino localization in DART4 GLKD

(A) Genome snapshot of ChIP-seq results of Rhino reduced regions in DART4 GLKD, as shown in Fig. 6D. Rhino reduced region in DART4 GLKD with remarkable loss of Rhino are shown. Black lines indicate boundaries of Rhino reduced regions in DART4 GLKD. The gray-shaded areas indicate the localization sites of potential ADMA-histones considered to be the ends of Rhino reduced regions in DART4 GLKD. (B) ChIP-seq profile showing enrichment of the indicated histone modifications in Rhino reduced regions in DART4 GLKD. ChIP-seq data from WT Oregon-R ovaries were used for each histone modification. SDMA: symmetric dimethylarginine. (C) qRT-PCR showing the expression of transposons and genes within Rhino reduced regions in DART4 GLKD (bx, CG31612, CG14629, and Nf-Yb). Primers are listed in Supplementary Table 1. (D) Egg hatching rate showing fertility in DART4 GLKD. (E) Proportion of transposon/gene sequences contained within Rhino reduced regions in DART4 GLKD and annotated all piRNA clusters. (F) Ratio of piRNA reads mapped to transposons in ctrl-GLKD ovaries72 within annotated all piRNA clusters and Rhino reduced regions in DART4 GLKD. The number of piRNA reads in each category is as follows: Transposons in all clusters: 1,020,456; intragenic regions in all clusters: 156,244; intergenic regions in all clusters: 123,370; transposons in Rhino reduced regions in DART4 GLKD: 2,568; intragenic regions in Rhino reduced regions in DART4 GLKD: 71,318; intergenic regions in Rhino reduced regions in DART4 GLKD: 9,907. (G) Box plot showing the lengths of Rhino localization sites in OSCs, Rhino reduced regions in DART4 GLKD, all piRNA clusters annotated in proTRAC (as in Fig. 6B), and highly transcribed piRNA clusters (42AB, 38C, 80F, and 102F).

Kipferl and Rhino behavior in DART4 GLKD

(A) ChIP-seq profile of Kipferl enrichment in bx, CG14629, and CG31612 DART4 piSL. (B) Genome snapshot of DART4 piSL in which a significant decrease in Kipferl was observed in DART4 GLKD. (C) Pie chart showing the number of Rhino peaks in piRNA clusters, transposons, and DART4 piSL in ctrl GLKD and DART4 GLKD. The numbers for each are as follows. piRNA clusters in ctrl GLKD: 282; transposons in ctrl GLKD: 114; DART4 piSL in ctrl GLKD: 60; piRNA clusters in DART4 GLKD: 402; transposons in DART4 GLKD: 173; DART4 piSL in DART4 GLKD: 30. (D) Boxplot showing Rhino enrichment in satellite repeats.

Control for the evolutionary analysis in Fig. 7

(A) RPKM of each ChIP-seq dataset in the piRNA cluster found in all 8 strains (cluster 8, evolutionarily conserved), random genomic regions, the whole genome, and DART4 piSL. Genomic regions with RPKM values above 30 in the input were excluded as they were considered ChIP-seq background noise. For the random genomic regions, 500 bp regions were randomly selected using Python’s random.choice to match the number of piRNA clusters. The whole genome represents the full length of chromosomes 2L, 2R, 3L, 3R, 4, and X. t-test was used for statistical hypothesis testing between clusters found in all 8 strains (cluster 8, evolutionarily conserved) and other indicated genomic regions. Random genomic regions are shown in Supplementary Table 10, and the p-values from the t-test are shown in Supplementary Table 11. (B) Same analysis of Fig. 7A and Supplementary Fig. 8A for each ChIP-seq dataset (Oregon-R) in the piRNA cluster dataset with eight levels of evolutionary novelty in which piRNA reads were detected in Oregon-R strain. (C) Same analysis of Fig. 7A for each ChIP-seq dataset (ctrl GLKD and DART4 GLKD) in the piRNA cluster dataset with eight levels of evolutionary novelty in which piRNA reads were detected in ctrl GLKD strain.

Validation of antibody specificity

(A) Western blotting on total lysates of FLAG-Rhino-inducible OSCs. The anti-Rhino antibody used in immunofluorescence (Fig. 5A) was used. (B) Heatmap showing the localization of H3R8me2a (antibodies produced by different industries) in OSCs at ChIP-seq peaks using the H3R8me2a antibody (Active Motif). The regions of 2 kb around the H3R8me2a localization sites were plotted based on macs2 callpeak results. (C) Western blotting using anti-H3R17me2a antibody on total lysates of FLAG-Rhino inducible OSCs. (D) Histone peptide array using H4R3me2a antibody. The table shows the top 10 single modification peptides with the highest signal intensity.