Abstract
Protein-protein interactions are fundamental to understanding the molecular functions and regulation of proteins. Despite extensive databases, many interactions remain uncharacterized because experimental validation is labor-intensive. In this study, we used the AlphaFold2 program to predict interactions among proteins localized in the nuage, a germline-specific non-membrane organelle critical for piRNA biogenesis and RNA regulation. We screened 20 nuage proteins for 1:1 interactions and predicted dimer structures. Among those, five pairs represented novel interaction candidates. Three of these pairs, including Spn-E_Squ, were confirmed by co-immunoprecipitation in cultured cells. Disrupting the salt bridges at the Spn-E_Squ interface abolished binding, confirming their importance and supporting the accuracy of the predicted model. We extended the analysis to interactions between three representative nuage components, Vas, Squ, and Tej, and approximately 430 oogenesis-related proteins. Following this extended analysis, co-immunoprecipitation in S2 cells verified interactions for three pairs: Mei-W68_Squ, CSN3_Squ, and Pka-C1_Tej. Furthermore, the majority of Drosophila proteins, ∼12,000, were screened for interaction with the Piwi protein, a central player in the piRNA pathway. Approximately 1.5% of the pairs, 164 in total, scored above 0.6 and were identified as potential binding partners. This in silico approach not only efficiently identifies potential interaction partners but also helps bridge the gap between bioinformatics and experimental biology.
Introduction
Around 10,000 to 20,000 different types of proteins are encoded in the genome of most organisms, catalyzing the vast majority of physico-chemical reactions in cells1. Many proteins have specialized functions and they are often regulated through protein-protein interactions, where the formation of protein complexes can activate, inhibit, or stabilize their partners. Furthermore, protein-protein interactions can recruit target proteins to specific locations where they will function or regulate the mobility of the protein complex2. Within cells, proteins are thought to exist in a crowded environment and frequently interact with other molecules3. Thus, characterizing protein-protein interactions is fundamental for understanding protein function and regulation. Large-scale analyses of protein-protein interactions have been carried out, including Tandem Affinity Purification coupled with Mass Spectrometry (TAP-MS) for the yeast proteome4 and the comprehensive 2-hybrid screening for the Human Reference Interactome (HuRI)5. Despite these extensive studies, the overall protein-protein interactions are still not fully understood in many organisms.
The binding between proteins is significantly influenced by their three-dimensional (3D) structures. The characteristics of their interfaces, including hydrogen bonds, salt bridges, and hydrophobicity, determine the interactions6. Therefore, to analyze protein-protein interactions physically and chemically, information on the individual 3D structures of proteins is necessary. The 3D structures of proteins have been determined through experimental methods such as X-ray crystallography, nuclear magnetic resonance (NMR), and cryo-electron microscopy7. However, these techniques demand considerable labor and time. The recently developed AlphaFold2 program can predict the 3D structure of a protein from its amino acid sequence with high accuracy8. This tool has not only been used in computational studies but has also become a valuable resource in experimental sciences for predicting protein complexes, as demonstrated with yeast protein complexes9.
In this study, we attempted a rapid screening of protein interactions using AlphaFold2 prediction, primarily focusing on components of nuage, a germline-specific, non-membrane organelle in Drosophila melanogaster that contains a wide variety of proteins with unique motifs and domains10. Nuage is known to serve as the production and amplification site for small non-coding piRNAs, which are bound by PIWI-family proteins. The piRNAs and the PIWI-family proteins function to repress mobile genetic elements, or transposons, which disrupt the genome through active transposition11. In addition to proteins involved in piRNA production, translation repressor proteins including Me31B, Cup, and Trailer hitch (Tral) also localize to nuage12. Previous studies have shown that the localization of several components in nuage depends on their partners in a hierarchical manner13. However, the interactions and organization among nuage components remain unclear.
By using AlphaFold2 predictions, we investigated 20 of the nuage-localizing or piRNA-related proteins for pairwise interactions. We confirmed the novel interactions of candidate pairs including Spindle-E (Spn-E)_Squash (Squ), by co-immunoprecipitation assay using cultured cells. In addition, a Squ mutant, which disrupts the salt bridges predicted at the interface with Spn-E, failed to interact with Spn-E, validating the accuracy of the predicted dimer structure. This screening was expanded for direct interacting pairs between piRNA-related proteins and proteins involved in oogenesis, as well as Piwi and other Drosophila proteins. This in silico approach not only streamlines the identification of interaction partners but also bridges the gap between bioinformatics predictions and experimental validation in biological research.
Results/Discussion
The nuage-localizing proteins and piRNA-related proteins used in the AlphaFold2 screening
Several dozen proteins engaged in piRNA production in germline cells exert their function by recruiting piRNA precursors and interacting with their partner proteins, forming a non-membrane structure called nuage10,13. Previous studies reported that many piRNA-related proteins localize to nuage and some localize to mitochondria (Table 1). In addition, protein components of processing bodies and sponge bodies, which are involved in the translation, storage, degradation, and transportation of mRNAs (such as Me31B, Cup, and Tral), also localize to nuage12 (Table 1). However, the details of how these proteins interact and organize themselves within the nuage remain unclear.
In this study, we used the AlphaFold2 program to screen for interactions among 20 proteins that are localized in the nuage and/or involved in piRNA production in Drosophila (Table 1). The monomeric structures of these 20 proteins, ranging in size from 20 kDa to 250 kDa, have already been predicted and are registered in databases14. This set includes both well-structured proteins and those that are largely disordered with numerous loops (Supplemental Fig. S1A). Of those, eight proteins feature one or more Tudor domains or extended Tudor (eTud) domains. The Tudor domain contains approximately 60 residues and folds into an antiparallel β-sheet with five strands forming a barrel-like fold, while the eTud domain includes an additional oligonucleotide/oligosaccharide-binding fold15. Both Tudor and eTud domains are known to bind predominantly to methylated lysine or arginine residues. In addition, five RNA helicases are included, such as Vasa (Vas) and Spn-E, the fly homolog of Tdrd9, which are essential for piRNA processing (Table 1). The C-terminal region of Vas is known to bind to the Lotus domain shared by two nuage components, Tejas (Tej) and Tapas. Spn-E was also recently shown to interact with Tej16. Among those 20 proteins, the Molecular Interaction Search Tool (MIST), a conventional database of protein-protein interactions, registers eight interacting pairs as direct binding and 28 interactions that are either direct or indirect (Table 1, Supplemental Fig. S1B, C)17.
Screening for the protein-protein interactions by AlphaFold2
We used the AlphaFold2 program to predict direct protein-protein interactions and the 3D structures of the complexes. Assuming 1:1 binding among the 20 proteins, a total of 400 dimer predictions (20 × 20, counting both orders of each pair and homodimers) were computed on a supercomputer. The prediction flow of AlphaFold2 consisted of two main parts8. Initially, a multiple sequence alignment was performed for each query protein and stored for future use. Subsequently, the AlphaFold2 program predicted 3D dimer structures based on the co-evolution inferred from the multiple sequence alignments. For each dimer prediction, five different structure models with varying parameters were generated. Among these, the model with the highest prediction confidence score (pcScore) was selected as the final prediction result. The pcScore combines two evaluations, one of the overall structure (pTM) and one of the dimeric interface (ipTM), with emphasis on the interface, as represented by the following formula18:
pcScore = 0.8 × ipTM + 0.2 × pTM
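As an illustration of the formula, the following minimal Python sketch computes the pcScore for each of the five models of one dimer prediction and keeps the highest-scoring model. The class and function names and the numeric values are hypothetical placeholders introduced only for this example; they are not taken from the screening data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelScore:
    """Scores reported for one structural model (hypothetical container)."""
    name: str
    iptm: float  # interface score (ipTM)
    ptm: float   # overall structure score (pTM)


def pc_score(iptm: float, ptm: float) -> float:
    """Weighted confidence: pcScore = 0.8 x ipTM + 0.2 x pTM."""
    return 0.8 * iptm + 0.2 * ptm


def best_model(models):
    """Return the model with the highest pcScore among the candidates."""
    return max(models, key=lambda m: pc_score(m.iptm, m.ptm))


# Placeholder values for five models of a single dimer (not real results).
example_models = [
    ModelScore("model_1", iptm=0.70, ptm=0.80),
    ModelScore("model_2", iptm=0.75, ptm=0.80),
    ModelScore("model_3", iptm=0.60, ptm=0.85),
    ModelScore("model_4", iptm=0.72, ptm=0.78),
    ModelScore("model_5", iptm=0.68, ptm=0.82),
]
top = best_model(example_models)
print(top.name, round(pc_score(top.iptm, top.ptm), 3))
```

Because the interface term carries four times the weight of the overall term, a well-folded protein with a high pTM raises the pcScore only slightly.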
These three values, pcScore, ipTM, and pTM, for each predicted pair were visualized in separate heatmaps (Fig. 1A). In general, pcScore and ipTM values showed similar trends, although a well-structured protein (e.g. Spn-E) tended to have a higher pTM value, which slightly elevated the pcScore. Based on this, we used the pcScore as an indicator of protein-protein interaction in this study. Each heterodimeric pair was calculated twice in the pairwise screening (e.g. proteins A_B and B_A), and the pcScores were plotted (Fig. 1B). The results showed significant variance between the two calculations for pairs with lower pcScores, while pairs with pcScores above 0.6 were relatively reproducible. Consequently, we set a threshold of 0.6 and considered protein pairs with pcScores above 0.6 as likely complex-forming candidates. This approach identified 13 pairs; seven of these were already known to form complexes, confirming the effectiveness of AlphaFold2 in predicting complex formation (Table 2). The highest-scoring pair was the Zuc homodimer, possibly because AlphaFold2 had learned from the crystal structure of the Zuc homodimer registered in the database19. For the remaining 12 pairs, the predicted 3D structures and the Predicted Aligned Error (PAE) plots are shown in Fig. 1C. Consistent with a previous report in the silkworm Bombyx mori20, both Argonaute 3 (AGO3) and Aub, members of the PIWI-family proteins sharing 50%-60% amino acid sequence similarity, were predicted to form dimers with Maelstrom (Mael) (Fig. 1C-i, ii, Table 2). AGO3 and Aub appeared to be well-folded proteins except for their flexible N-terminal regions. In contrast, the Mael protein was divided into three parts: an N-terminal HMG domain, a middle MAEL domain, and a C-terminal disordered region21 (Fig. 1C-i, ii). AlphaFold2 predicted that the MAEL domain interacted with AGO3 and Aub.
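The pairwise design and the 0.6 threshold can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not specify how the two orientation scores (A_B and B_A) were combined, so the sketch reports both and flags a pair as a candidate when the higher of the two exceeds the threshold. The function names and the toy scores are hypothetical.

```python
from itertools import product

THRESHOLD = 0.6  # pcScore cutoff used in the screening


def ordered_pairs(proteins):
    """All ordered pairs, including homodimers (20 proteins give 400)."""
    return list(product(proteins, repeat=2))


def reciprocal_summary(scores, threshold=THRESHOLD):
    """Group A_B and B_A scores under one unordered pair.

    scores maps (protein_a, protein_b) to a pcScore. Taking the higher
    orientation as the pair score is an assumption made for this sketch.
    """
    grouped = {}
    for (a, b), score in scores.items():
        grouped.setdefault(tuple(sorted((a, b))), []).append(score)
    summary = {}
    for pair, values in grouped.items():
        summary[pair] = {
            "best": max(values),
            "spread": max(values) - min(values),  # orientation disagreement
            "candidate": max(values) > threshold,
        }
    return summary


# Toy scores with placeholder protein names (not real screening output).
toy_scores = {
    ("A", "B"): 0.72, ("B", "A"): 0.70,
    ("A", "C"): 0.25, ("C", "A"): 0.55,
    ("A", "A"): 0.90,
}
for pair, info in sorted(reciprocal_summary(toy_scores).items()):
    print(pair, info)
```

A small spread between the two orientations corresponds to the higher reproducibility observed for pairs scoring above 0.6.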
Me31B, Tral, and Cup are recognized as RNA regulators localized to the nuage and/or sponge body, though they are not directly involved in the piRNA pathway. Previous studies have indicated that these proteins form complexes12,22,23. Me31B is a well-conserved RNA helicase and showed a tightly folded structure composed of two concatenated RecA helicase domains24. On the other hand, Tral and Cup were predicted to be largely disordered with some secondary structure elements (Fig. 1C-iii, iv). The predicted dimer structures of Me31B_Tral and Cup_Me31B scored 0.74 and 0.68, respectively (Table 2). Consistent with the previous study23, AlphaFold2 predicted that the FDF motif of Tral, which contains a Phe-Asp-Phe sequence folded into two α-helices spanning residues 405 to 537, was associated with Me31B (Fig. 1C-iii). In addition, an α-helix and loop regions of Cup were predicted to make contact with Me31B (Fig. 1C-iv). BoYb and Vret are both eTud domain-containing proteins25, and their direct interaction has been suggested by the high recovery of BoYb in Vret immunoprecipitates from the ovary26. The predicted structure revealed that both BoYb and Vret consist of two domains, one N-terminal and the other C-terminal, connected by a flexible region (Fig. 1C-v). Interactions were predicted between their N-terminal domains and between their C-terminal domains, respectively. It has been reported that Tej, known as Tdrd5 in mammals, binds directly to Vas through its N-terminal Lotus domain27 (Fig. 1C-vi) and to Spn-E through the loop region contiguous with its eTud domain16 (Fig. 1C-vii). The predicted structures of Tej_Vas and Spn-E_Tej were consistent with the binding properties reported previously.
The remaining five pairs, previously unreported as directly interacting, were considered novel binding pairs (Table 2, Fig. 1C-viii-xii). These interactions were experimentally examined using Drosophila S2 cultured cells, which are derived from embryonic somatic cells and lack germline-specific proteins. Previously, Squ was co-immunoprecipitated with Spn-E along with other nuage components from ovarian lysate28, but whether this interaction was direct had not been examined. In a co-immunoprecipitation assay in S2 cells, Myc-Spn-E was strongly detected by Western blotting in the precipitate of Flag-Squ, which supports the possibility of a direct interaction between Spn-E and Squ in S2 cells devoid of germline proteins (Fig. 1D-i). Similarly, AlphaFold2 predicted a direct interaction between Aub and Vret, which was corroborated by co-immunoprecipitation (Fig. 1D-ii). The binding of another pair, BoYb_Shutdown (Shu), was also confirmed in S2 cells (Fig. 1D-iv). Thus, three of the five candidate pairs were confirmed to interact, supporting the effectiveness of AlphaFold2 in identifying binding partners. However, BoYb_Spn-E and Me31B_Vret did not show an interaction in these assays (Fig. 1D-iii, v), possibly reflecting weak interactions that co-immunoprecipitation failed to detect.
Evaluation of Spn-E and Squ interaction in culture cells and ovaries
Among the binding candidates, we focused on the predicted dimer structure of the Spn-E_Squ pair. Spn-E is an evolutionarily conserved RNA helicase that is expressed in germline cells. It plays a crucial role in piRNA production and transposon suppression in germline cells28,29. Similarly, Squ is expressed in the ovary and testis and is involved in piRNA production, although its molecular role is less defined29,30. While squ is conserved across Drosophila species (Supplemental Fig. 2A, B), vertebrate orthologs remain unidentified. Spn-E contains four domains: DEAD/DEAH helicase, Hel-C, HA2, and eTud domains (Fig. 2A). Its predicted 3D structure was well folded and contained few flexible regions (Fig. 1C-viii). In contrast, Squ was predicted to be largely disordered, containing three α-helices and two β-strands (Fig. 2A). The middle part of Squ was in close contact with Spn-E and showed lower PAE values, suggestive of their interaction (Fig. 1C-viii, 2A). AlphaFold2 predicts five structure models for each query using different initial model parameters (models 1-5), and a pcScore is given to each model. For the Spn-E_Squ pair, the pcScores ranged from 0.74 to 0.77. The 3D structures of Spn-E were very similar across all five models, superimposing almost perfectly (Fig. 2B). The middle region of Squ was consistently positioned relative to Spn-E, although the N- and C-terminal regions of Squ remained flexible (Fig. 2B).
Closer examination of the Spn-E_Squ dimer interface revealed that a short α-helix of Squ (residues 106-116) fitted into a groove on the Spn-E surface, while the anti-parallel β-sheet (residues 140-153) was also predicted to interact with Spn-E (Fig. 2A, C). Physico-chemical structural analysis using the PDBePISA server (EMBL-EBI) identified salt bridges between Spn-E and Squ (Supplemental Table S2, S3)31. To validate these predicted interactions, we generated Squ mutants in which each residue involved in the four salt bridges (E107, E109, R115, and K163) was substituted with alanine (Fig. 2D, Supplemental Fig. S2B) and assessed their interactions by co-immunoprecipitation in S2 cells expressing the tagged proteins Myc-Spn-E and Flag-Squ. The assay revealed that while the E107A single mutation did not affect the interaction, the other single mutations mildly reduced the binding of Squ to Spn-E (Supplemental Fig. S3A). Furthermore, the localization of GFP-tagged Squ and mKate2 (mK2)-tagged Spn-E was examined in S2 cells. When only Squ was expressed, it was dispersed in the cytosol (Supplemental Fig. S3B). On the other hand, when only Spn-E was expressed, it localized in the nucleus as reported previously16. When co-expressed with wild-type Squ or the single mutants, Spn-E relocated to the cytoplasm and formed granules together with Squ, suggesting an interaction between them. Although the single mutants could still bind to Spn-E, the Squ quadruple mutant (Squ4A) completely lost binding (Fig. 2E) and did not co-localize with Spn-E in S2 cells (Supplemental Fig. S3B). These results suggest that the salt bridges are important for the interaction between Spn-E and Squ and support the accuracy of the dimer structure predicted by AlphaFold2.
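The design of the alanine substitutions can be illustrated with a short sketch that applies point mutations written in the usual notation (e.g. E107A) and checks that the wild-type residue at each position matches before substituting. The helper name is hypothetical, and the toy sequence below is a placeholder built only so that the four positions carry the expected residues; it is not the Squ sequence.

```python
import re

_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")

# The four salt-bridge residues of Squ, each substituted with alanine (Squ4A).
SQU_4A = ["E107A", "E109A", "R115A", "K163A"]


def apply_point_mutations(sequence, mutations):
    """Return a copy of sequence carrying the given point mutations.

    Positions are 1-based. A ValueError is raised when the notation is
    malformed, the position is out of range, or the wild-type residue
    does not match the sequence.
    """
    residues = list(sequence)
    for mutation in mutations:
        match = _MUTATION.match(mutation)
        if match is None:
            raise ValueError("malformed mutation: %r" % mutation)
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError("position out of range: %r" % mutation)
        if residues[position - 1] != wild_type:
            raise ValueError("wild-type mismatch at %r" % mutation)
        residues[position - 1] = new
    return "".join(residues)


# Placeholder sequence (poly-glycine with the four residues set), NOT Squ.
toy = ["G"] * 170
for pos, aa in {107: "E", 109: "E", 115: "R", 163: "K"}.items():
    toy[pos - 1] = aa
toy_sequence = "".join(toy)
toy_4a = apply_point_mutations(toy_sequence, SQU_4A)
print(toy_4a[104:117])
```

Checking the wild-type residue guards against off-by-one numbering errors, which are easy to introduce when isoforms or tags shift residue positions.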
While the RNA-binding site of Spn-E has not been extensively studied, it is presumed to be near the helicase domain, similar to the Vas helicase-RNA complex32. In addition, Lin et al. demonstrated that the Hel-C domain of Spn-E interacts with the eSRS region of Tej, which recruits Spn-E to nuage16, at a site distinct from the predicted Squ-binding sites (Fig. 2A). Interestingly, a tetrameric complex of Spn-E_Squ_Tej_RNA predicted by the recently available AlphaFold333 placed the single-stranded RNA (ssRNA) near the helicase domain of Spn-E (Fig. 2F), aligning with the ssRNA-binding position found in Vas (Supplemental Fig. S3C). The predicted tetramer model suggests that Squ binding to Spn-E does not inhibit, but may regulate, the interaction of Spn-E with Tej or RNA by stabilizing the domain orientation of Spn-E (Fig. 2F).
We investigated whether Spn-E also interacts with Squ within the Drosophila ovary. The antibody against Squ detected a specific band of the expected size by Western blotting in the heterozygous control ovarian lysate, which was absent in the transheterozygous mutant, squPP32/HE47 (Fig. 3A)30. Consistent with the previous report using a transgenic line expressing HA-Squ30, immunostaining of ovaries revealed that Squ localizes to nuage, overlapping with Spn-E endogenously tagged with mK2 (Fig. 3B). Spn-E was co-immunoprecipitated with Squ from ovarian lysate, indicating an interaction between Squ and Spn-E (Fig. 3C). While the previous mass spectrometry analysis detected the PIWI-family proteins Piwi, Aub, and AGO3 in Spn-E immunoprecipitates28, these three proteins were not present in the immunoprecipitate of Squ (Fig. 3C), further supporting a direct interaction between Squ and Spn-E.
Screening oogenesis-related proteins for interaction with nuage proteins
Given the role of nuage in piRNA biogenesis and germline development, interactions between nuage-localized proteins and those involved in oogenesis were expected. We employed AlphaFold2 to predict these interactions using Vas, Squ, and Tej, representative nuage components that remain incompletely understood, as baits. For the 430 proteins in the oogenesis pathway34, dimer structures of 1,290 pairs (430 proteins × 3 baits) were predicted (Supplemental Table S4), with 18 pairs scoring above 0.6 (Table 3). Among those, co-immunoprecipitation in S2 cells confirmed the interactions of three pairs, Mei-W68_Squ, CSN3_Squ, and Pka-C1_Tej (Fig. 4A, B, Table 3). In the Mei-W68_Squ dimer, which scored 0.63, the binding site of Squ for Mei-W68 was predicted at the α-helices in its middle region, which overlaps with the site interacting with Spn-E (Table 3, Fig. 4A-i, compare with Fig. 1C-viii). Mei-W68 is a topoisomerase, known as Spo11 in many organisms, which is required for the formation of double-strand breaks during meiosis35.
Interestingly, Squ also plays a role in the DNA damage response pathway and showed a genetic interaction with chk2, a meiotic checkpoint gene30. These results suggest that the binding of Squ to Mei-W68 may regulate the enzymatic activity of Mei-W68 to suppress the excessive formation of double-strand breaks. Another confirmed pair was CSN3_Squ, scoring 0.62 (Fig. 4A-ii, B-ii). CSN3, a component of the COP9 signalosome, which removes Nedd8 modifications from target proteins, is required for the self-renewal of germline stem cells36. Pka-C1, a cAMP-dependent protein kinase involved in axis specification, rhythmic behavior, and synaptic transmission37, was predicted to bind to the N-terminal Lotus domain of Tej (score 0.64, Fig. 4A-iii, B-iii), which is also known as the binding site for Vas27. This suggests a potential competition between Pka-C1 and Vas for binding to Tej. Although the fraction of confirmed interactions was low (3 out of 18) (Table 3, Supplemental Fig. S4), the results indicate that these protein pairs could interact within cells if co-expressed in vivo.
Screening all Drosophila proteins for Piwi-interacting proteins
Given the crucial role of Piwi in piRNA biogenesis, heterochromatin formation, and germline stem cell (GSC) maintenance, we employed AlphaFold2 to screen all proteins in Drosophila melanogaster for potential Piwi interactions. Piwi, the founding member of the PIWI-family proteins, is not only essential for binding piRNAs and regulating complementary mRNAs but also plays a critical role in GSC self-renewal38. Studies have shown that Piwi lacking the N-terminal moiety containing the nuclear localization signal (NLS) still retains its GSC self-renewal capability. Its function in GSC self-renewal is exerted independently in the cytoplasm of GSC niche cells, separate from its role in transposon repression. The crystal structures of Drosophila Piwi and silkworm Siwi have been solved, revealing the organization of four domains (N, PAZ, MID, and PIWI)39,40. Recently, the ternary structure of piRNA, target RNA, and MILI, a mouse ortholog of Piwi, was reported, in which the bound piRNA threads through the channel between the N-PAZ and MID–PIWI lobes (Supplemental Fig. S5A)41.
To identify novel Piwi-binding proteins, we conducted a 1:1 interaction screening involving approximately 12,000 Drosophila proteins, excluding any proteins over 2,000 amino acid residues due to computational limits. The pcScores by AlphaFold2 were primarily low, with over 98% being below 0.6, suggesting a low likelihood of interaction between Piwi and the vast majority of the proteins (Fig. 5A). Approximately 1.5% of the pairs, 164 in total, scored above 0.6 and were expected to include novel binding partners (Supplemental Table S5). The top 24 candidates, with pcScores greater than 0.75, are listed in Table 4. This list contained many metabolic enzymes and three piRNA-related proteins, Asterix (Arx), Mael, and Hen1. The interactions between Mael and PIWI-family proteins have already been reported20. Arx, known as Gtsf1 in mammals and integral to Piwi–piRISC-mediated transcriptional silencing in the nucleus42, had a high pcScore (0.83, Table 4). Although its three-dimensional structure has been determined by NMR spectroscopy43, the structure of the Arx_Piwi complex has remained elusive. AlphaFold2 predicted that while Arx lacked a compact domain, the majority of the Arx protein associated around the PIWI domain, except for the flexible C-terminal region (residues 130-167) (Fig. 5B-i). Three Arx paralogs in Drosophila (CG34283, CG32625, and CG14036) were also predicted to bind to Piwi with high pcScores, suggesting that they interact within cells (Supplemental Fig. S5B). Although CG34283 is not expressed, CG32625 and CG14036 are moderately and highly expressed in the ovary, respectively37. However, unlike arx, knockdown of each paralogous gene did not result in de-repression of a transposon, mdg142, suggesting that they may be pseudogenes or possess redundant roles.
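The summary statistics of such a bait-versus-proteome screen can be computed as in the minimal sketch below, which counts the preys scoring above the 0.6 threshold, reports their fraction, and lists the top candidates above a stricter cutoff of 0.75. The function name and the toy scores are hypothetical and serve only to show the bookkeeping.

```python
def summarize_screen(scores, threshold=0.6, top_cutoff=0.75):
    """Summarize a one-bait screen.

    scores maps a prey identifier to the pcScore of the bait_prey dimer.
    Returns the number and fraction of preys above threshold, and the
    preys above top_cutoff sorted from highest to lowest score.
    """
    if not scores:
        raise ValueError("no scores supplied")
    above = {prey: s for prey, s in scores.items() if s > threshold}
    top = sorted(
        ((prey, s) for prey, s in above.items() if s > top_cutoff),
        key=lambda item: item[1],
        reverse=True,
    )
    return {
        "n_screened": len(scores),
        "n_above": len(above),
        "fraction_above": len(above) / len(scores),
        "top": top,
    }


# Toy scores with placeholder identifiers (not the screening results).
toy_screen = {"P1": 0.83, "P2": 0.77, "P3": 0.64, "P4": 0.30, "P5": 0.12}
summary = summarize_screen(toy_screen)
print(summary["n_above"], summary["fraction_above"], summary["top"])
```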
Hen1 is a methyltransferase known to mediate methylation of the terminal 2’ hydroxyl group of small interfering RNAs and piRNAs, thereby enhancing the stability of the small RNAs. Consistent with the previous report showing Hen1 binding to Piwi42, the dimer structure of Hen1_Piwi was predicted with a high pcScore of 0.77. This prediction further suggests that Hen1 is recruited to Piwi, which positions it close to the piRNA substrate (Fig. 5B-ii). Another potential interacting protein for Piwi was CG33703, a protein whose function remains uncharacterized despite having 75 paralogs listed in the Drosophila genome37. Together with three of these paralogs (CG33783, CG33647, and CG33644), CG33703 was predicted to form a dimer with Piwi (pcScores of 0.82) (Table 4, Supplemental Fig. S5C). The domain of unknown function, DUF109144, shared by these paralogs was predicted to associate with the PIWI domain (Fig. 5B-iii). Although these proteins are generally not expressed under normal conditions37, their potential to bind Piwi suggests a regulatory role under abnormal or stress conditions in which CG33703 or its paralogs are expressed. In addition, we examined two oogenesis-related proteins, Twinfilin (Twf, pcScore 0.64, Fig. 5B-iv) and Brainiac (Brn, pcScore 0.63, Fig. 5B-v), for binding to Piwi by co-immunoprecipitation (Fig. 5C, Supplemental Table S5). While no binding was observed with Twf, significant binding was detected with Brn, which is involved in dorsal-ventral polarity determination in follicle cells45.
In this study, we have identified several potential partners for novel protein interactions, though the physiological relevance of these pairs remains to be elucidated. The expression patterns of these candidate proteins within the organism are crucial for further validation of our findings. It is likely that these proteins interact when co-expressed in the same cellular context. Under typical growth conditions, these interactions might not occur; however, in stress or disease states where these proteins are upregulated, the likelihood of interaction increases, potentially implicating these interactions in the disruption of normal cellular functions and contributing to disease or tumorigenesis. Furthermore, in silico screening proves extremely valuable, especially when dealing with toxic bait proteins, as it allows us to narrow down the list of potential candidates and reduce the need for hazardous experimental procedures. Ultimately, establishing these potential interactions in vivo could significantly advance our understanding of protein functions under both normal and pathological conditions.
Materials and Methods
Antibodies
The anti-Squ antibody was generated as follows. His-tagged full-length Squ was expressed in the Escherichia coli BL21(DE3) strain using a plasmid in which the squ coding region was subcloned into the pDEST17 vector (Thermo Fisher Scientific). His-Squ was solubilized with 6 M urea in PBS, purified using Nickel Sepharose beads (GE Healthcare) following the manufacturer’s protocol, and subsequently used for immunization of rats. The antibodies used for Western blotting analysis were rat anti-Spn-E16 (1:500), rat anti-Ago316 (1:200), guinea pig anti-Aub46 (1:1000), mouse monoclonal anti-Piwi (G-1, sc-390946, Santa Cruz Biotechnology (United States)), and mouse monoclonal anti-α-Tubulin (DM1A, sc-32293, Santa Cruz Biotechnology). The secondary antibodies used in this study were HRP-conjugated goat anti-guinea pig (Dako, Cat.# P0141), HRP-conjugated goat anti-rat (Dako, Cat.# P0450), HRP-conjugated goat anti-mouse (BioRad, Cat.# 1706516) and HRP-conjugated goat anti-rabbit (BioRad, Cat.# 1706515). HRP-conjugated anti-DDDDK-tag antibody (MBL, Cat.#M185-7) and HRP-conjugated anti-Myc-tag antibody (MBL, Cat.#M192-7) were used to detect FLAG-tagged and Myc-tagged proteins, respectively.
AlphaFold2 prediction for the direct interacting protein pairs
Amino acid sequences for Drosophila proteins were obtained from Flybase37. For proteins annotated with multiple isoforms, only the longest isoform was selected. Proteins exceeding 2,000 residues were excluded due to computational limitations. AlphaFold v2.2 program was installed in the Supercomputer for Quest to Unsolved Interdisciplinary Datascience (SQUID) at Cyber Media Center in Osaka University. All necessary protein sequence databases for AlphaFold2 were stored on an SSD device connected to the SQUID system.
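The sequence selection rules described above (longest isoform per gene, exclusion of proteins exceeding 2,000 residues) can be sketched as follows. The function name, the tuple layout of the input, and the toy identifiers are assumptions made for illustration; ties between isoforms of equal length are resolved here by keeping the first one encountered, which the text does not specify.

```python
MAX_LENGTH = 2000  # proteins longer than this were excluded


def select_screening_set(isoforms, max_length=MAX_LENGTH):
    """Pick the longest isoform per gene, then drop over-length proteins.

    isoforms is an iterable of (gene_id, isoform_id, sequence) tuples.
    Returns a dict mapping gene_id to (isoform_id, sequence). A gene
    whose longest isoform exceeds max_length is excluded entirely.
    """
    longest = {}
    for gene_id, isoform_id, sequence in isoforms:
        current = longest.get(gene_id)
        if current is None or len(sequence) > len(current[1]):
            longest[gene_id] = (isoform_id, sequence)
    return {
        gene_id: entry
        for gene_id, entry in longest.items()
        if len(entry[1]) <= max_length
    }


# Toy records with placeholder identifiers and poly-M sequences.
toy_isoforms = [
    ("geneA", "A-PA", "M" * 100),
    ("geneA", "A-PB", "M" * 150),
    ("geneB", "B-PA", "M" * 2001),
    ("geneB", "B-PB", "M" * 500),
    ("geneC", "C-PA", "M" * 2000),
]
selected = select_screening_set(toy_isoforms)
print(sorted((gene, iso) for gene, (iso, _) in selected.items()))
```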
The AlphaFold2 prediction process was divided into two steps: generation of the multiple sequence alignment (MSA) and prediction of the 3D structure. The MSAs were computed on SQUID’s CPU node and stored for reuse. For dimer structure prediction, the two MSAs corresponding to the dimer pair were placed in the directories msas/A and msas/B. The calculations were performed on the GPU node with the options -t 2022-05-14 -m multimer -l 1 -p true. AlphaFold2 generates five structural models for each prediction. To speed up the prediction, the five computations were assigned to five GPU units, whereas the original AlphaFold2 program computes the five models one at a time. The prediction confidence score (pcScore) was provided for each model, and the highest pcScore among the five models was used as the prediction score for the corresponding dimer structure. PAE plots for dimer structures were drawn by extracting the data from the pkl files generated by AlphaFold2. The list of protein pairs scoring above 0.6 and the corresponding PAE plots and PDB structures are available on GitHub (https://dme-research.github.io/AF2_2/).
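The PAE extraction step can be sketched as below. This is a minimal illustration, not the script used in this study. It assumes that each result pkl file unpickles to a dictionary holding the PAE matrix under the key predicted_aligned_error, and that the residues of chain A precede those of chain B in the matrix; real result files may additionally require numpy (and possibly other libraries) to be installed for unpickling. The function names are hypothetical. The second function averages the inter-chain blocks of the matrix, where low values indicate a confidently placed interface.

```python
import pickle


def load_pae(pkl_path):
    """Read the PAE matrix from a result pickle (key name is assumed)."""
    with open(pkl_path, "rb") as handle:
        result = pickle.load(handle)
    return result["predicted_aligned_error"]


def mean_interchain_pae(pae, len_a):
    """Average PAE over residue pairs lying in different chains.

    pae is a square matrix (nested lists or an array indexable as
    pae[i][j]); len_a is the number of residues in chain A, which is
    assumed to occupy the first len_a rows and columns.
    """
    n = len(pae)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(n):
            if (i < len_a) != (j < len_a):  # exactly one residue in chain A
                total += float(pae[i][j])
                count += 1
    if count == 0:
        raise ValueError("len_a must leave residues in both chains")
    return total / count


# Toy 3 x 3 matrix: chain A has two residues, chain B has one.
toy_pae = [
    [1.0, 2.0, 10.0],
    [2.0, 1.0, 20.0],
    [12.0, 18.0, 1.0],
]
print(mean_interchain_pae(toy_pae, len_a=2))
```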
AlphaFold3 prediction for the RNA-containing complex structure
The structure of Spn-E_Squ_Tej complexed with RNA, 5’-CUGACUACCGAAGUACUACG-3’, was predicted by the AlphaFold3 prediction server (https://golgi.sandbox.google.com/)33.
Analysis of protein 3D structure
The protein 3D structures were visualized using ChimeraX software47. The Spn-E_Squ dimer interface was analyzed with the ‘Protein interfaces, surfaces and assemblies’ service (PISA) at the European Bioinformatics Institute (http://www.ebi.ac.uk/pdbe/prot_int/pistart.html)31.
Fly stocks
All stocks were maintained at 25°C with standard methods. Mutant alleles of squ (squPP32 and squHE47) were used in this study30. The Spn-E-mK2 knock-in fly was previously generated16. The y w strain served as the control.
Western blotting
Ovaries were homogenized in the ice-cold PBS and denatured in the presence of SDS sample buffer at 95°C for 5 min. The samples were then subjected to SDS-PAGE and transferred to ClearTrans SP PVDF membrane (Wako). The primary and secondary antibodies described above were diluted in the Signal Enhancer reagent HIKARI (Nacalai Tesque). Chemiluminescence was induced by the Chemi-Lumi One reagent kit (Nacalai Tesque) and detected with ChemiDoc Touch (Bio-Rad). The bands were quantified using ImageJ48 or Image Lab software (Bio-Rad).
Co-immunoprecipitation in S2 cells
Drosophila Schneider S2 cells were cultured at 28°C in Schneider’s medium supplemented with 10% (v/v) fetal bovine serum and antibiotics (penicillin and streptomycin). Protein coding regions were cloned into the pENTR vector (Thermo Fisher Scientific) and then transferred into pAFW or pAMW destination vectors. S2 cells (0.2-2 × 10^6 cells/ml) were seeded in 12-well plates overnight and transfected using Hilymax (Dojindo Molecular Technologies, Japan). After 36-48 hours, S2 cells were resuspended in 360 μl of ice-cold PBS containing 0.02% Triton-X100 and 1x protease inhibitor cocktail (Roche), and sonicated (0.5 sec, 5 times). The resulting lysate was clarified by centrifugation at 15,000 × g for 15 min at 4°C. Then, 300 μl of the supernatant was incubated with 6 μl of prewashed anti-FLAG magnetic beads (MBL) or anti-Myc magnetic beads (Thermo Fisher Scientific) for 1.5 h at 4°C with gentle rotation. After incubation, the beads were washed three times with 800 μl of ice-cold PBS with 0.02% Triton-X100, denatured in SDS sample buffer, and subjected to SDS-PAGE and Western blotting. 1% of the total lysate was loaded as the input sample.
Co-localization assay in S2 cells
Construction of GFP-tagged or mKate2-tagged proteins and transfection were conducted as described in the previous section. After 48 h of transfection, the cells were placed onto concanavalin A-coated coverslips for 20 min, fixed with PBS containing 4% (w/v) paraformaldehyde for 15 min at room temperature, permeabilized twice for 10 min with PBX [PBS containing 0.2% (v/v) Triton X-100], stained with DAPI (1:1000), and mounted with Fluoro-Keeper Antifade Reagent (Nacalai Tesque). Images were acquired on a ZEISS LSM 900 with Airyscan 2 using a 63x oil objective (NA 1.4) and processed using ZEISS ZEN 3.0 and ImageJ48.
Crosslinking immunoprecipitation (CL-IP)
As previously described16, 100 ovaries from y w flies were dissected in ice-cold PBS, fixed in PBS containing 0.1% (w/v) paraformaldehyde for 20 min on ice, quenched in 125 mM glycine for 20 min, and then homogenized in CL-IP lysis buffer. The lysate was incubated at 4°C for 20 min and then sonicated. After centrifugation at maximum speed for 10 min at 4°C, the supernatant was collected and diluted with an equal volume of CL-IP wash buffer. For pre-clearing, 10 μl of a pre-washed Dynabeads Protein G/A mixture (1:1) (Invitrogen) was added and incubated at 4°C for 1 h. Anti-Squ antibody was added to the cleared supernatant at a 1:500 dilution and incubated at 4°C overnight. Then, 20 μl of the pre-washed Dynabeads Protein G/A mixture (1:1) (Invitrogen) was added and incubated at 4°C for 3 h. After three washes with CL-IP wash buffer, the beads were collected and 50 μl of CL-IP wash buffer containing SDS sample buffer was added. The beads were heated at 95°C for 5 min and subjected to SDS-PAGE and Western blotting.
Immunostaining of ovaries
As previously described16,46, ovaries were dissected, fixed, permeabilized with PBX, and immunostained. The primary and secondary antibodies were anti-Squ antibody (this study, 1:500) and Alexa Fluor 488-conjugated anti-rat IgG (Thermo Fisher Scientific, 1:200). Images were acquired on a ZEISS LSM 900 with Airyscan 2 using a 63x oil objective (NA 1.4) and processed using ZEISS ZEN 3.0 and ImageJ48.
Data availability statement
PDB files and PAE plots for the protein dimers with pcScores above 0.6 have been deposited and are available at GitHub Pages (https://dme-research.github.io/AF2_2/).
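For readers who want to apply the same cutoff to their own score tables, a minimal sketch of the selection step is given below. The dictionary layout, the pair names, and the function name are hypothetical; only the threshold of 0.6 (strictly above) comes from the text.

```python
def select_candidates(scores, threshold=0.6):
    """Return (pair, score) tuples with score strictly above `threshold`, highest first."""
    hits = [(pair, score) for pair, score in scores.items() if score > threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical pcScores keyed by "ProteinA_ProteinB" pair names.
example_scores = {"ProtA_ProtB": 0.82, "ProtA_ProtC": 0.60, "ProtB_ProtC": 0.31}
print(select_candidates(example_scores))
```

A pair scoring exactly 0.60 is excluded here because the criterion is "more than 0.6".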
Acknowledgements
AlphaFold2 predictions were performed on the large-scale computer system Supercomputer for Quest to Unsolved Interdisciplinary Datascience (SQUID) at the Cybermedia Center, Osaka University, through the Research Proposal-based Use, Large-Scale High-Performance Computing Projects to K.S. (Cybermedia Center, Osaka University) and the High Performance Computing Infrastructure (HPCI) System Research Project (Project ID: hp240099 to K.S.). We thank Dr Trudi Schüpbach (Princeton University) for the generous gift of squ mutant flies. We also thank the FBS Core Facility at Osaka University for providing access to the LSM 900 and ChemiDoc Touch. We appreciate the insightful discussions and suggestions from all members of K.T.’s laboratory.
Additional information
Author contributions
Conceptualization: S.K.;
Methodology: S.K., X.X., S.T., D.S.;
Software: S.K., S.T., Y.K., K.R., S.R., D.S.;
Validation: S.K.;
Formal analysis: S.K., X.X.;
Investigation: S.K., X.X.;
Resources: S.K., T.K.;
Data curation: S.K.;
Writing - original draft: S.K., X.X., T.K.;
Visualization: S.K., X.X.;
Supervision: S.K., T.K.;
Project administration: S.K., T.K.;
Funding acquisition: S.K., T.K.
Competing Interests Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Funding Statement
This work was supported by TAKEDA Bioscience Research Grant [J191503009 to K.T.]; Grant-in-Aid for Transformative Research Areas (A) [21H05275 to K.T.]; and Osaka University Institute for Datability Science “Transdisciplinary Research Project” [Na22990007 to K.T. and K.S.].
References
- 1. COG database update: focus on microbial diversity, model organisms, and widespread pathogens. Nucleic Acids Res 49:D274–D281
- 2. High mobility of proteins in the mammalian cell nucleus. Nature 404:604–609
- 3. Biomolecular interactions modulate macromolecular structure and dynamics in atomistic model of a bacterial cytoplasm. eLife 5
- 4. Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 415:141–147
- 5. A reference map of the human binary protein interactome. Nature 580:402–408
- 6. Principles of Protein−Protein Interactions: What are the Preferred Ways For Proteins To Interact? Chem. Rev 108:1225–1244
- 7. RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning. Nucleic Acids Res 51:D488–D508
- 8. Highly accurate protein structure prediction with AlphaFold. Nature 596:583–589
- 9. Computed structures of core eukaryotic protein complexes. Science 374
- 10. piRNA pathway and the potential processing site, the nuage, in the Drosophila germline. Dev. Growth Differ 54:66–77
- 11. PIWI proteins and PIWI–interacting RNAs in the soma. Nature 505:353–359
- 12. Comparative Proteomics Reveal Me31B’s Interactome Dynamics, Expression Regulation, and Assembly Mechanism into Germ Granules during Drosophila Germline Development. Sci. Rep 10
- 13. Unique germ-line organelle, nuage, functions to repress selfish genetic elements in Drosophila melanogaster. Proc. Natl. Acad. Sci. U. S. A 104:6714–6719
- 14. AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences. Nucleic Acids Res 52:D368–D375
- 15. Structure and domain organization of Drosophila Tudor. Cell Res 24:1146–1149
- 16. Tejas functions as a core component in nuage assembly and precursor processing in Drosophila piRNA biogenesis. J. Cell Biol 222
- 17. Molecular Interaction Search Tool (MIST): an integrated resource for mining gene and protein interaction data. Nucleic Acids Res 46:D567–D574
- 18. Protein complex prediction with AlphaFold-Multimer. bioRxiv https://doi.org/10.1101/2021.10.04.463034
- 19. Structure and function of Zucchini endoribonuclease in piRNA biogenesis. Nature 491:284–287
- 20. Maelstrom functions in the production of Siwi-piRISC capable of regulating transposons in Bombyx germ cells. iScience 25
- 21. Crystal Structure and Activity of the Endoribonuclease Domain of the piRNA Pathway Factor Maelstrom. Cell Rep 11:366–375
- 22. Structural Basis for the Mutually Exclusive Anchoring of P Body Components EDC3 and Tral to the DEAD Box Protein DDX6/Me31B. Mol. Cell 33:661–668
- 23. Similar Modes of Interaction Enable Trailer Hitch and EDC3 To Associate with DCP1 and Me31B in Distinct Protein Complexes. Mol. Cell. Biol 28:6695–6708
- 24. Molecular basis for GIGYF–Me31B complex assembly in 4EHP-mediated translational repression. Genes Dev 33:1355–1360
- 25. Deciphering arginine methylation: Tudor tells the tale. Nat. Rev. Mol. Cell Biol 12:629–642
- 26. A systematic analysis of Drosophila TUDOR domain-containing proteins identifies Vreteno and the Tdrd12 family as essential primary piRNA pathway factors. EMBO J 30:3977–3993
- 27. The LOTUS domain is a conserved DEAD-box RNA helicase regulator essential for the recruitment of Vasa to the germ plasm and nuage. Genes Dev 31:939–952
- 28. Spindle-E cycling between nuage and cytoplasm is controlled by Qin and PIWI proteins. J. Cell Biol 213:201–211
- 29. A Transcriptome-wide RNAi Screen in the Drosophila Ovary Reveals Factors of the Germline piRNA Pathway. Mol. Cell 50:749–761
- 30. Zucchini and squash encode two putative nucleases required for rasiRNA production in the Drosophila germline. Dev. Cell 12:851–862
- 31. Inference of Macromolecular Assemblies from Crystalline State. J. Mol. Biol 372:774–797
- 32. Structural Basis for RNA Unwinding by the DEAD-Box Protein Drosophila Vasa. Cell 125:287–300
- 33. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630:493–500
- 34. The Gene Ontology knowledgebase in 2023. Genetics 224
- 35. mei-W68 in Drosophila melanogaster encodes a Spo11 homolog: evidence that the mechanism for initiating meiotic recombination is conserved. Genes Dev 12:2932–2942
- 36. Protein competition switches the function of COP9 from self-renewal to differentiation. Nature 514:233–236
- 37. FlyBase: updates to the Drosophila genes and genomes database. Genetics 227
- 38. Separation of stem cell maintenance and transposon silencing functions of Piwi protein. Proc. Natl. Acad. Sci. U. S. A 108:18760–18765
- 39. Crystal Structure of Silkworm PIWI-Clade Argonaute Siwi Bound to piRNA. Cell 167:484–497
- 40. Crystal structure of Drosophila Piwi. Nat. Commun 11
- 41. Mammalian PIWI–piRNA–target complexes reveal features for broad and efficient target silencing. Nat. Struct. Mol. Biol 1–10 https://doi.org/10.1038/s41594-024-01287-6
- 42. DmGTSF1 is necessary for Piwi–piRISC-mediated transcriptional transposon silencing in the Drosophila ovary. Genes Dev 27:1656–1661 https://doi.org/10.1101/gad.221515.113
- 43. Asterix/Gtsf1 links tRNAs and piRNA silencing of retrotransposons. Cell Rep 34
- 44. SMART: recent updates, new developments and status in 2020. Nucleic Acids Res 49:D458–D460
- 45. The neurogenic genes egghead and brainiac define a novel signaling pathway essential for epithelial morphogenesis during Drosophila oogenesis. Development 122:3863–3879
- 46. The Tudor Domain-Containing Protein, Kotsubu (CG9925), Localizes to the Nuage and Functions in piRNA Biogenesis in D. melanogaster. Front. Mol. Biosci 9
- 47. UCSF ChimeraX: Structure visualization for researchers, educators, and developers. Protein Sci. Publ. Protein Soc 30:70–82
- 48. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 9:671–675
- 49. The Tudor domain protein Tapas, a homolog of the vertebrate Tdrd7, functions in the piRNA pathway to regulate retrotransposons in germline of Drosophila melanogaster. BMC Biol 12
- 50. Aub and Ago3 are recruited to nuage through two mechanisms to form a ping-pong complex assembled by Krimper. Mol. Cell 59:564–575
- 51. PAPI, a novel TUDOR-domain protein, complexes with AGO3, ME31B and TRAL in the nuage to silence transposition. Dev. Camb. Engl 138:1863–1873
- 52. Belle is a Drosophila DEAD-box protein required for viability and in the germ line. Dev. Biol 277:92–101
- 53. In vivo profiling of the Zucchini proximal proteome in the Drosophila ovary. Dev. Camb. Engl 150
- 54. The cochaperone shutdown defines a group of biogenesis factors essential for all piRNA populations in Drosophila. Mol. Cell 47:954–969
- 55. Repression of Retroelements in Drosophila Germline via piRNA Pathway by the Tudor Domain Protein Tejas. Curr. Biol 20:724–730
Copyright
© 2024, Shinichi et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.