The C9orf72 G4C2 hexanucleotide repeat expansion is the most common genetic mutation in amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) [1, 2]. This expansion can be translated into five types of dipeptide repeat proteins (DPRs): polyPR, polyGR, polyGA, polyGP, and polyPA [3]. The positively-charged arginine-containing DPRs (R-DPRs) show the highest levels of toxicity in different cell and animal models [4-10] with polyPR known to be the most toxic DPR [5, 9, 11]. R-DPRs have been linked to a wide variety of cellular defects [12-16], but a growing body of evidence suggests that neurodegenerative diseases, including C9orf72 ALS/FTD (C9-ALS/FTD), may be caused by disruption of nucleocytoplasmic transport (NCT) [17-20]. At the same time, many transport components that play a prominent role in NCT have been identified to function as modifiers of G4C2/DPR toxicity [5, 10, 21].

The regulated trafficking of proteins and RNA between the nucleus and cytoplasm occurs through nuclear pore complexes (NPCs) embedded in the nuclear membrane [22, 23]. The NPC is lined with intrinsically disordered phenylalanine-glycine-rich nucleoporins (FG-Nups) that collectively function as a selective permeability barrier. Small molecules rapidly diffuse through the NPC, but the passage of larger cargoes across the barrier needs to be facilitated by their binding to nuclear transport receptors (NTRs) [24-26]. The β-karyopherin (Kapβ) family is the largest class of NTRs and includes both import and export receptors [27]. Another essential regulator of NCT is the GTPase Ran, a small protein bound to guanosine triphosphate (GTP) in the nucleus and to guanosine diphosphate (GDP) in the cytoplasm [24]. The directionality of NCT is mediated by the RanGTP-RanGDP gradient over the nuclear envelope, which is preserved by the cytoplasmic GTPase-activating protein RanGAP and the nuclear guanine nucleotide exchange factor RanGEF [24, 28].

In the import cycle, importins bind their cargoes directly through a nuclear localization signal (NLS) encoded on a cargo. Importin β1 (Impβ1) can also recruit importin α (Impα) that functions as a cargo-adaptor protein. Impα binds to Impβ1 through its N-terminal importin β-binding (IBB) domain [29, 30]. The importin-NLS-cargo complex then shuttles to the nucleus. The binding of RanGTP to importin in the nucleus disassembles the importin-NLS-cargo complex and the RanGTP-importin complex is recycled to the cytoplasm. When cargo is bound to Impβ1 via Impα, RanGTP dissociates Impβ1 from the Impα-NLS-cargo. This triggers a competition between the flexible IBB domain of Impα and NLS-cargo for binding to Impα, thus facilitating the dissociation of the NLS-cargo. Nucleoporins such as Nup50/Nup2 also catalyze this process by binding to Impα and accelerating the dissociation rate of the cargo-NLS [22]. The specific receptor CAS bound to RanGTP is required to export Impα to the cytoplasm. It has been proposed that CAS first displaces Nup50/Nup2 from Impα after which the RanGTP-CAS-Impα returns to the cytoplasm. The hydrolysis of RanGTP to RanGDP in the cytoplasm by RanGAP disassembles the RanGTP-importin and RanGTP-CAS-Impα complexes [22]. RanGDP is transported back to the nucleus by nuclear transport factor 2 (NTF2) where the RanGDP-NTF2 complex dissociates when RanGEF regenerates RanGTP [22]. In the export cycle, RanGTP promotes the loading of cargoes with nuclear export signal (NES) to the exportin in the nucleus. The resulting RanGTP-exportin-NES-cargo complex moves to the cytoplasm. Once there the complex is disassembled by RanGAP which hydrolyses RanGTP to RanGDP [24].

In a recent study we analyzed the binding of polyPR to the Kapβ family of importins and exportins [13, 16, 31] using coarse-grained (CG) molecular dynamics simulations [32]. Depending on its length, polyPR can interact with several cargo-, IBB-, RanGTP-, and FG-Nup-binding sites on the Kapβs [32]. Besides Kapβs, there is evidence for direct binding of Impα isomers to R-DPRs [16]. Some regulators of the Ran cycle are also affected in R-DPR-mediated toxicity: R-DPRs have been shown to cause mislocalization and abnormal accumulation of RanGAP [33], and mislocalization of Ran and RanGEF in cell culture models [5, 33, 34]. RanGAP and RanGEF also appear to be modifiers of R-DPR toxicity in genetic studies [5, 10, 11]. It is not clear, however, whether these effects arise from a direct interaction of R-DPRs with NCT components. The aim of the current work is to extend the findings of [32] by investigating the interaction between polyPR and various NCT components by means of CG molecular dynamics simulations.

Results and discussion

Coarse-grained models of nucleocytoplasmic transport components

We use the residue-scale CG molecular dynamics approach developed and applied earlier to study DPR phase separation [35] and the direct binding of polyPR to numerous members of the Kapβ family [32]. In the present work we investigate the interaction of polyPR with unbound human Impα isomers (Impα1, Impα3, Impα5, Impα7), Ran, CAS (the specific exporter of Impα), RanGEF, and NTF2. We also include KAP60 (homolog of Impα), Cse1 (homolog of CAS), and RanGAP from yeast, since contrary to the human homologs, the crystal structures of the KAP60-Cse1 complex and the RanGAP-RanGppNHp complex (fission yeast RanGAP bound to the non-hydrolyzable form of human RanGTP) are available in the protein data bank. This enables us to investigate a possible polyPR interference with Impα export and the RanGAP function in the model system of yeast. Moreover, yeast has been employed previously to study NCT defects caused by DPRs [5, 36]. More details about the selected transport components can be found in table S2 of the SI.

In our one-bead-per-amino-acid (1BPA) CG models of transport components, each residue is represented by a single bead at the position of the alpha-carbon atom. The overall tertiary structure of the NCT components is preserved through a network of stiff harmonic bonds, and the distribution of charged and aromatic residues is included in the model. The 1BPA force field correlates with experimental findings for polyPR-Kapβ interactions [32]. The CG models of the NCT components studied are shown in figure 1. The Impα isomers contain a flexible N-terminal IBB domain, followed by a helical core that is constructed from 10 Armadillo (ARM) repeats, each consisting of three alpha helices. The NLS binding sites are located on the concave surface of the helical core [37]. The IBB domain has an autoinhibitory role: when it is not bound to Impβ1, it binds to the concave ARM core and competes with NLS binding [22]. To simplify the study of possible binding between polyPR and the NLS binding sites of Impα, the CG models of the Impα isomers are built without their N-terminal IBB domains and are referred to as ImpαΔN. Similar to the importins and exportins, CAS and Cse1 are constructed from HEAT repeats, with each repeat consisting of two antiparallel α-helices, named A and B, connected by linkers of different lengths [38]. RanGEF is mainly constructed from β-strands and has the overall appearance of a seven-bladed propeller, with each blade consisting of a four-stranded antiparallel β-sheet [39]. The RanGAP model of fission yeast used in our study is constructed from eleven leucine-rich repeats (LRRs) forming a symmetric crescent, followed by a highly negatively-charged C-terminal region. Each LRR motif consists of a β-strand-α-helix hairpin unit [40]. NTF2 is a homodimer with each monomer consisting of a β-sheet and three α-helices [41]. The CG model of Ran is based on the nucleotide-free state of the molecule. More details about the CG models and force field are provided in the Methods section and section 1 of the SI.

Coarse-grained models of polyPR and transport components used in this study

One-bead-per-amino-acid (1BPA) representation of PR7, PR20, and the various transport components modeled in the current study. These are several members of the Impα family (excluding the N-terminal IBB domain), the specific exporters of Impα (CAS and Cse1), RanGAP, RanGEF, NTF2, and Ran. Details regarding the CG models and protein sequences are listed in tables S2 and S3.

PolyPR binds to several transport components through electrostatic interactions

In investigating the direct interaction with the transport components shown in figure 1 we include the potential effect of polyPR length, as various studies have found that the repeat length of DPRs strongly correlates with toxicity [4, 42-44]. Simulations of polyPR with varying numbers of repeat units, i.e., PR7, PR20, and PR50, are performed at two salt concentrations: 200 mM, similar to previous in vitro experiments performed for Kapβ importins and Impα isomers interacting with R-DPRs [16], and a lower ion concentration of 100 mM to study the effect of salt concentration.

To quantify the interaction between polyPR and the transport components, we calculate the time-averaged number of contacts Ct using a cutoff of 1 nm. The number of contacts is normalized by the sequence length of the transport components (NTC) and by the polyPR length (NPR). The normalized number of contacts is plotted against the charge parameter

$$\mathrm{NCPR} - f\,\frac{M}{R_g},$$

where the net charge per residue NCPR is the total charge of the transport component (in units of elementary charge e) divided by its sequence length, M (in units of e · nm) is the time-averaged total dipole moment, and Rg (in units of nm) is the time-averaged radius of gyration of the transport components in isolation. The dimensionless parameter f is a free parameter, which is found to be 0.0036 for the best linear fit in figure 2a for PR20 and PR50, based on the quality of fit (R²), see figure S1. The linear correlation observed in figure 2a confirms an electrostatically driven interaction between polyPR and the transport components. Moreover, it highlights the importance of the spatial distribution of charge over the transport components, as characterized here through the dipole moment. CAS and Cse1, the specific exporters of Impα in human and yeast, respectively, are constructed from HEAT repeats and have a super-helical conformation similar to the Kapβs studied before [32]. We therefore present the results for these two cases jointly with the results for the Kapβ set (taken from [32]) in figure 2b. For this set, the best fit is obtained for f = 0 (see figure S1), which indicates the dominant role of NCPR for the number of contacts between polyPR and the Kapβs, CAS, and Cse1. We attribute this behavior to the structural characteristics of Kapβs, particularly the superhelical structure, which features inner and outer surfaces with differing charge distributions. Importantly, this structural arrangement creates an inner surface characterized by a strong negative electrostatic potential. As demonstrated in our previous work [32], polyPR predominantly binds to this negatively charged cavity within Kapβs. Consequently, the separation of charges on the Kapβ surface becomes less influential compared to the overall charge.
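The fitting procedure described above can be illustrated with a minimal Python sketch. It assumes that f is chosen by scanning a grid of candidate values and keeping the one that maximizes R² of a straight-line fit of the normalized contacts against NCPR − fM/Rg; the text does not state how the scan was actually carried out. The function names (`charge_parameter`, `r_squared`, `best_f`) and all numbers in the demonstration are hypothetical placeholders, not data from this study.

```python
import numpy as np

def charge_parameter(ncpr, dipole, rg, f):
    """Charge parameter NCPR - f*M/Rg (NCPR in e, M in e*nm, Rg in nm)."""
    return np.asarray(ncpr) - f * np.asarray(dipole) / np.asarray(rg)

def r_squared(x, y):
    """R^2 of a least-squares straight-line fit of y against x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def best_f(ncpr, dipole, rg, contacts, f_grid):
    """Return the grid value of f with the highest R^2, and that R^2."""
    scores = [r_squared(charge_parameter(ncpr, dipole, rg, f), contacts)
              for f in f_grid]
    best = int(np.argmax(scores))
    return float(f_grid[best]), float(scores[best])

# Synthetic placeholder data (NOT from the paper): eight hypothetical
# components whose contacts depend exactly linearly on the charge
# parameter with f = 0.004, so the scan should recover that value.
rng = np.random.default_rng(0)
ncpr = rng.uniform(-0.08, 0.0, size=8)
dipole = rng.uniform(10.0, 120.0, size=8)
rg = rng.uniform(2.0, 4.0, size=8)
contacts = -5.0 * charge_parameter(ncpr, dipole, rg, 0.004) + 0.1

f_grid = np.linspace(0.0, 0.01, 101)
f_opt, r2_opt = best_f(ncpr, dipole, rg, contacts, f_grid)
print(f_opt, r2_opt)
```

In this picture, a best-fit value of f = 0 (as reported for the Kapβ set) simply means that adding the dipole term does not improve the linear correlation over NCPR alone.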

The transport component's net charge per residue and dipole moment, together with polyPR length, affect polyPR interaction with various nuclear transport components

(a) and (b) show the normalized time-averaged number of contacts Ct for the interaction of polyPR with 7, 20, and 50 repeat units with different types of transport components. The results are shown for monovalent salt concentrations of Csalt = 200 mM (left panels) and Csalt = 100 mM (right panels). Subfigure (a) shows the results for the transport components shown in figure 1, excluding the specific exporters of Impα: CAS and Cse1. A linear correlation is observed between the normalized Ct and NCPR − fM/Rg, with f found to be 0.0036 for the best fit. The net charge per residue NCPR is in units of elementary charge e, the dipole moment M is in units of e · nm, and the radius of gyration Rg is in units of nm. Subfigure (b) shows the results for the Kapβ data set (data points taken from [32]) together with CAS and Cse1. For this case, a linear correlation between the normalized Ct and NCPR is observed. The dashed lines show linear fits for PR20 and PR50, see table S4 for the linear equations of the fits. The fits for PR7 resulted in R² values of 0.89 (a) and 0.83 (b) for 200 mM and of 0.7 (a) and 0.59 (b) for 100 mM. Because of the low R² values for 100 mM, the fits for PR7 are not shown. The error bars denote the standard error of the mean. Where error bars are invisible, they are smaller than the marker size.

As can be seen in figure 2a, RanGAP features a much larger number of contacts with polyPR than the other transport components, which can be related to the higher negative net charge and the higher dipole moment of this molecule (see the dipole moments M and NCPR of the transport components in figures S2 and S3 of the SI). In contrast, polyPR makes a negligible number of contacts with Ran, since it has a relatively low dipole moment and no net charge. For NTF2 the value of NCPR is −0.047 e, comparable to several members of the ImpαΔN family, but the lower dipole moment of NTF2 results in a lower number of contacts with polyPR. The binding of PR20 to the ImpαΔN isomers is consistent with the binding of Impα1 and Impα3 to R-DPRs found in experiment [16], despite the fact that the N-terminal IBB domains of Impα1 and Impα3 are excluded from our CG models. At 200 mM salt concentration, polyPR does not bind to NTF2 and Ran. PolyPR contact with RanGEF is also very low at this salt concentration. Reducing the salt concentration to 100 mM increases the number of contacts, clearly indicating that electrostatic forces are the main driver for binding. At this lower salt concentration, PR7, PR20, and PR50 make contact with RanGEF, but contact with NTF2 is only observed for longer polyPR chains.

For transport components with higher absolute values of NCPR and NCPR − fM/Rg, the number of contacts increases with increasing polyPR length. However, as can be seen in figure 2, the number of contacts per unit length of polyPR is often lower for longer polyPRs, especially at the lower salt concentration where polyPR strongly binds to certain transport components, see e.g. the results in figure 2a for the ImpαΔN family and RanGAP at 100 mM salt concentration. This is due to the fact that for shorter polyPRs most of the residues make contact with the target protein, while for longer polyPRs only some parts of the chain are in contact with the transport components and other regions make less or no contact.

PolyPR interacts with important binding sites of transport components

In order to gain a better understanding of how polyPR interacts with each transport component, we examined the polyPR contact probability of each residue in the sequence of the transport component. The contact probability of each residue is defined as the probability of having at least one polyPR residue within its 1 nm proximity. Figure 3a reveals our results of the interaction between PR7 and PR50 with Impα1ΔN, KAP60ΔN, Cse1, RanGAP, RanGEF and NTF2 (for the other transport components, see figure S4). These findings indicate that a longer polyPR makes contact with a larger number of sites, and also exhibits a higher contact probability with individual residues compared to a shorter polyPR. We also observe that some regions at the C-terminal ends of the ImpαΔN family and RanGAP are permanently bound to polyPR. We also compare the polyPR binding sites with known binding sites of transport components (according to the PiSITE webserver [45]), highlighting them at the bottom of each subfigure. Figure 3b displays the number of contact residues shared between polyPR and the native binding partners of each transport component. As expected, the general trend is that the number of shared binding sites increases with increasing polyPR length. PolyPR interacts with the ImpαΔN family at several known cargo-NLS, Nup50/Nup2, and Cse1 binding sites. Longer polyPRs exhibit a significantly stronger interaction with cargo-NLS binding sites of ImpαΔN compared to shorter polyPRs, which only interact with a limited number of sites. However, we observe that both short and long polyPRs bind to Nup50/Nup2, Cse1, and RanGTP binding sites particularly those located near the C-terminal end of ImpαΔN. In the case of Cse1, we observe polyPR binding to Impα and RanGTP binding sites. PolyPR interacts with the known RanGTP binding sites of RanGAP. We also show that polyPR is able to bind to the highly negatively-charged region in the C-terminal domain of RanGAP that follows the LRR domain. 
It has been suggested that this negatively-charged region is in close proximity to a positively-charged region in Ran (in the complex formed by Ran and RanGAP) and plays a role in RanGTP hydrolysis [46, 47]. Unfortunately, this region is not resolved in the PDB crystal structure and thus its binding sites are not known. In the case of RanGEF, a longer polyPR interacts with a high percentage of the known Ran binding sites. For NTF2 we observe polyPR interaction with RanGDP binding sites. PolyPR makes negligible contacts with the nucleotide-free state of Ran. For CAS the binding sites are not known. Therefore, these two cases are excluded from figure 3b.
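The per-residue contact probability and the count of shared binding sites used in this section can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the analysis code of this study: coordinates are assumed to be already loaded as arrays of bead positions in nm, periodic boundary conditions are ignored, "within 1 nm" is taken as a strict inequality, and the 0.10 contact-site threshold is the one given in the Methods. The function names and the toy coordinates are hypothetical.

```python
import numpy as np

def contact_probability(target_xyz, polypr_xyz, cutoff=1.0):
    """Fraction of frames in which each target residue has at least one
    polyPR bead within `cutoff` (nm).

    target_xyz: array (n_frames, n_target, 3)
    polypr_xyz: array (n_frames, n_polypr, 3)
    Assumption: no periodic-image correction is applied.
    """
    diff = target_xyz[:, :, None, :] - polypr_xyz[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)        # (frames, target, polyPR)
    in_contact = (dist < cutoff).any(axis=2)    # (frames, target)
    return in_contact.mean(axis=0)

def n_shared(prob, known_sites, threshold=0.10):
    """Number of residues that are both polyPR contact sites
    (probability > threshold) and known native binding sites."""
    contact_sites = {int(i) for i in np.flatnonzero(prob > threshold)}
    return len(contact_sites & {int(i) for i in known_sites})

# Toy example: three target residues on a line, one polyPR bead, two frames.
target = np.zeros((2, 3, 3))
target[:, :, 0] = [0.0, 5.0, 10.0]
polypr = np.zeros((2, 1, 3))
polypr[0, 0, 0] = 0.5   # frame 0: near residue 0
polypr[1, 0, 0] = 5.5   # frame 1: near residue 1

prob = contact_probability(target, polypr)
shared = n_shared(prob, known_sites=[1, 2])  # hypothetical known sites
print(prob, shared)
```

Residue indices here are zero-based array positions; mapping them onto PDB residue numbering (as used by PiSITE) is a separate bookkeeping step that is not shown.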

PolyPR interacts with several known binding sites of nuclear transport components in a length-dependent manner

(a) The contact probability for each residue in the sequence of transport components interacting with polyPR. The plot displays the contact probability for six transport components: Impα1ΔN, KAP60ΔN, Cse1, RanGAP, RanGEF and NTF2 at a salt concentration of 100 mM. Results for Impα3ΔN, Impα5ΔN, Impα7ΔN, CAS, and Ran are shown in figure S4. Each figure shows two curves for PR7 and PR50. The bottom part of each figure shows the binding sites for NLS-cargo, Impα, CAS/Cse1, RanGTP, and Nup50/Nup2 using different colors. These binding sites are obtained from the crystal structures of the bound states of transport components in the Protein Data Bank using PiSITE (see table S2 of the SI for more details). For each transport component the following binding sites are marked. For the Impα family: NLS-cargo (vertical black lines), CAS/Cse1 (vertical purple lines) and Nup50/Nup2 (vertical orange lines) binding sites. For CAS/Cse1: Impα (vertical black lines) and RanGTP (vertical green lines) binding sites. For RanGAP: RanGTP (vertical green lines) binding sites. For RanGEF: Ran (vertical green lines) binding sites. For NTF2: RanGDP (vertical green lines) binding sites. The Ran binding sites marked for RanGEF are taken from the RanGEF-Ran complex (an intermediate step in the RanGEF function).

(b) The number of shared contact sites between polyPR and the binding partners of the transport components, referred to as Nshared, is plotted for PR7, PR20, and PR50. In each bar plot, the numbers inside the parentheses on the horizontal axis show the number of known binding sites obtained from PiSITE. If there is no known binding site, a (-) mark is used instead. The results for PR7, PR20, and PR50 are reported from left to right for each set of bar plots. The bars with darker colors represent longer polyPR chains.

The findings presented in figure 3b and S4, along with previous research on the interaction between polyPR and Kapβs (importins and exportins) [32], lead to the following mechanistic understanding for the direct effect of polyPR on NCT as illustrated in figure 4. In this figure the native binding interactions that are affected by polyPR are indicated by red arrows and those that are unaffected by grey arrows. In the import cycle, NLS-cargoes bind to Kapβs directly or indirectly through adaptor proteins such as Impα isomers. PolyPR may impede the loading of cargo to Kapβs and Impα isomers (as shown in inset A) by binding to the cargo-NLS sites. For Impα isomers, the binding of polyPR to cargo-NLS binding sites is mostly limited to longer chains, see figure 3b. RanGTP disassembles the import complex (cargo-Kapβ or cargo-Impα-Kapβ), and Nup2/Nup50 facilitate the disassembly of the cargo-Impα complex in the nucleus. PolyPR binding to the RanGTP binding sites on the Kapβs and the Nup2/Nup50 binding sites on Impα, see figure 3b, could result in defects in the dissociation of cargo/cargo-Impα from Kapβ and cargo from Impα (as shown in insets B and C). CAS/Cse1 export Impα by forming a complex with Impα and RanGTP. Findings in figure 3b show that polyPR also interacts with certain CAS/Cse1 binding sites that recognize Impα and RanGTP, possibly interfering with the formation of the RanGTP-CAS/Cse1-Impα complex (as shown in inset D). RanGAP plays a crucial role in the NCT cycle by mediating the hydrolysis of RanGTP to RanGDP, leading to the disassembly of export complexes (RanGTP-importin, RanGTP-CAS/Cse1-Impα, and RanGTP-exportin-cargo). 
The relatively high number of contacts between polyPR and RanGAP (see figure 2a), as well as polyPR binding to RanGTP binding sites on RanGAP (see figure 3b), suggest a possible defect in the dissociation of the RanGTP-importin and RanGTP-CAS/Cse1-Impα complexes in the import and Ran cycle, and of the RanGTP-exportin-cargo complex in the export cycle (as shown in inset E). Following the hydrolysis of RanGTP to RanGDP in the cytoplasm, RanGDP is transported back to the nucleus by nuclear transport factor 2 (NTF2). Once in the nucleus, the RanGDP-NTF2 complex dissociates when RanGEF exchanges GDP for GTP in Ran. Figure 3b shows that longer polyPRs interact with the Ran binding sites of RanGEF and NTF2. We therefore suggest that longer polyPRs may also interfere with the Ran cycle by hindering the loading of RanGDP to NTF2 and the exchange of GDP to GTP in Ran by RanGEF (as shown in insets G and F). The interaction between polyPR and NTF2 is distinguished by a relatively lower number of contacts (figure 2a) and contact probabilities for individual NTF2 residues (figure 3b). These findings suggest a lower likelihood of polyPR interfering with NTF2 function compared to the other functions outlined in figure 4. In the export cycle, as examined previously [32], polyPR binding to RanGTP and FG-Nup binding sites may affect cargo-loading onto exportins (as shown in inset H) and the transport of exportins through the NPC (as shown in inset I). It should be noted that the mechanisms proposed in figure 4 do not hold equal weight, and the relative contributions based on the number of contacts (presented in figure 2), and the number of contacts with important binding sites (presented in figure 3b and S4) should be considered when comparing the relative significance of the suggested mechanisms.

Suggested molecular mechanism of polyPR interference with the native function of transport components in the nucleocytoplasmic transport cycle.

(a) Proposed mechanistic pathways of polyPR interference with the import cycle (left panel), Ran cycle (middle panel), and export cycle (right panel). Steps in the NCT cycle are represented with grey arrows, and a red dashed arrow indicates where polyPR may interfere with the transport cycle. The letters A-H are used to illustrate how polyPR may disrupt the function of the transport components. Each letter corresponds to a mechanism shown at the bottom of the figure in grey circles. It should be noted that the proposed mechanisms are not equally significant. The relative significance of the suggested molecular mechanisms can be obtained by considering their relative contributions based on the number of contacts and the number of contacts with important binding sites as presented in figures 2 and 3, respectively.

Conclusions


In this study we used coarse-grained molecular dynamics simulations to show that polyPR binds to several nuclear transport components. Similar to the interaction of polyPR with the Kapβ family, the interaction of polyPR with other transport components is driven by electrostatic interactions and depends sensitively on polyPR length. Reducing the salt concentration or increasing the polyPR length increases the number of contacts with different transport components. The effect of polyPR length suggests a molecular basis for the more toxic nature of longer polyPRs in animal and cell models [4, 42-44].

We observed polyPR binding to several members of the Impα family, CAS, Cse1, and RanGAP yet no binding to Ran. PolyPR strongly binds to a highly negatively charged region in the C-terminal domain of fission yeast RanGAP. The human RanGAP also contains a similar highly negatively-charged region. Our simulations therefore suggest that a direct interaction may contribute to the observed polyPR-mediated accumulation and mislocalization of RanGAP in HeLa cells [33]. PolyPR also binds to RanGEF and NTF2 at lower salt concentrations or when the polyPR length is large enough. We also showed that incorporating the dipole moment leads to an improved fit for the transport components analyzed in figure 2a, suggesting that binding is influenced not only by the net charge per residue (as previously observed for Kapβs [32]) but also by the spatial separation of charges on the transport component.

We showed that polyPR interacts with important binding sites of different transport components in a polyPR-length-dependent manner, with polyPR interaction with RanGTP/RanGDP binding sites being a common feature between the transport components. This suggests a strong polyPR interference with the Ran gradient across the nuclear envelope. For the ImpαΔN family, we observe polyPR interaction with cargo-NLS and Nup2/Nup50 binding sites. In the case of KAP60ΔN (yeast homolog of Impα), we observe polyPR interaction with Cse1 binding sites. For Cse1 (yeast CAS), we also observe polyPR interaction with Impα binding sites. These findings suggest polyPR interference with the association and dissociation of cargo-NLS with Impα, and with the export of Impα.

In conclusion, we showed a pronounced direct binding interaction between polyPR and a surprisingly large number of transport components. By integrating our findings with previously reported data, this work proposes a molecular model that explains how the binding of polyPR might interfere with distinct stages of the transport cycle. The intrinsic length dependence of polyPR binding to important binding sites of many transport components makes this mechanism a potential target for therapeutic interventions. Overall, our results offer a basis for future research that aims to explore the impact of C9orf72 R-DPRs on NCT disruption and the subsequent downstream consequences.

Methods


Coarse-grained model

We adopt a one-bead-per-amino-acid (1BPA) force field to study polyPR interaction with nucleocytoplasmic transport components. This 1BPA approach was initially developed to simulate disordered FG-Nups [25, 48-53], and was later extended to study the phase separation of DPRs [35] and the interaction of polyPR with Kapβs [32]. The force field potentials and parameters in this study are identical to those employed in [32]. The bonded potentials for polyPR are residue and sequence specific. For non-bonded polyPR-polyPR interactions, the force field accounts for hydrophobic/hydrophilic and electrostatic interactions. The crystal structure of the transport components is maintained using a stiff harmonic potential $\phi_{\mathrm{network}} = K(r-b)^2$, where K is 8000 kJ/mol/nm², r is the distance between two amino acid beads, and b is the distance between these beads in the crystal structure. A bond is made between the beads if b is less than 1.4 nm. The unresolved regions in the crystal structure from the Protein Data Bank, and the regions with a lower prediction score (pLDDT < 70) from AlphaFold [54, 55], are included in the CG model as disordered regions. The CG model of fission yeast RanGAP includes two alpha helices in the C-terminal domain, as predicted by AlphaFold. The polyPR interactions with transport components are classified into three categories: (1) electrostatic interactions, (2) cation-pi interactions, and (3) excluded volume interactions. Our force field also accounts for the screening effect of ions. More details about the CG models and force field are provided in table S2 and section 1 of the SI.
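As an illustration of the structure-preserving network, the following Python sketch builds harmonic bonds between all bead pairs closer than 1.4 nm in a reference structure and evaluates the network energy, reading the potential as K(r − b)² with K = 8000 kJ/mol/nm². It is a sketch under assumptions: whether disordered regions are excluded from the network, and whether a factor 1/2 is absorbed into K, is not stated in the text, and the function names and toy coordinates are hypothetical.

```python
import numpy as np

K_NETWORK = 8000.0   # kJ/mol/nm^2, value quoted in the text
B_CUTOFF = 1.4       # nm, bond is made if reference distance b < 1.4 nm

def build_network(ref_xyz, cutoff=B_CUTOFF):
    """Return (pairs, b): index pairs of bonded beads and their
    reference distances b, taken from the reference coordinates (nm)."""
    ref_xyz = np.asarray(ref_xyz, dtype=float)
    i, j = np.triu_indices(len(ref_xyz), k=1)    # all unique pairs i < j
    b = np.linalg.norm(ref_xyz[i] - ref_xyz[j], axis=1)
    keep = b < cutoff
    return np.column_stack((i[keep], j[keep])), b[keep]

def network_energy(xyz, pairs, b, k=K_NETWORK):
    """Sum of k*(r - b)^2 over all network bonds (kJ/mol)."""
    xyz = np.asarray(xyz, dtype=float)
    r = np.linalg.norm(xyz[pairs[:, 0]] - xyz[pairs[:, 1]], axis=1)
    return float(np.sum(k * (r - b) ** 2))

# Toy reference structure: three beads on a line, 1 nm apart.
ref = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0],
                [2.0, 0.0, 0.0]])
pairs, b = build_network(ref)          # beads 0-2 are 2 nm apart: no bond

# Stretch the second bond by 0.1 nm and evaluate the energy penalty.
moved = ref.copy()
moved[2, 0] = 2.1
print(pairs.tolist(), network_energy(ref, pairs, b),
      network_energy(moved, pairs, b))
```

For a real protein the all-pairs distance calculation scales quadratically with the number of residues, which is still inexpensive at one bead per amino acid.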

Simulation and analysis

Langevin dynamics simulations are performed at 300 K at monovalent salt concentrations of 100 mM and 200 mM in NVT ensembles with a time-step of 0.02 ps and a Langevin friction coefficient of 0.02 ps⁻¹ using GROMACS version 2018. Simulations are performed for at least 2.5 μs in cubic periodic boxes, and the last 2 μs are used for analyzing the interaction between polyPR and the transport components. The error bars in figure 2 are standard errors of the mean (SEM) calculated from block averaging with three blocks at equilibrium. The binding sites are obtained from the crystal structures of the bound states of transport components in the Protein Data Bank using PiSITE [45]. This web-based database provides interaction sites of a protein from multiple PDB entries, including similar proteins. The RanGTP binding data (vertical green lines) in figures 3 and S4 contain binding residues for both RanGTP and RanGppNHp, the non-hydrolysable form of RanGTP. The time-averaged number of contacts between polyPR and the transport components in figure 2 is obtained by summing the number of contacts per time frame (i.e. the number of polyPR/transport component residue pairs that are within 1 nm) over all frames and dividing by the total number of frames. The contact probability for each transport component residue is the probability of having at least one polyPR residue within 1 nm proximity of the transport component residue. The contact probability is calculated for each transport component residue by dividing the number of frames for which this contact criterion is satisfied by the total number of frames. In figure S5 we show that conducting longer simulations does not significantly affect the contact probabilities presented in figure 3a (data shown for PR50 binding to Impα1ΔN and RanGAP), confirming convergence of our computations. Residue i is considered to be a contact site if the contact probability for this residue is larger than 0.10. 
Nshared is the number of transport component residues that make contact with polyPR (obtained in our simulations) and at the same time are known for recognition of native binding partners (according to PiSITE). The time-averaged total dipole moment of the CG models of transport components is calculated using gmx dipoles in GROMACS.
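The time-averaged number of contacts and its block-averaged error can be sketched in Python as follows. This is a minimal illustration under assumptions: bead coordinates are taken as already loaded NumPy arrays in nm, periodic images are ignored, the normalization divides by the product of the two sequence lengths counted in beads, and the SEM is computed as the sample standard deviation of three block means divided by √3. The function names and toy numbers are hypothetical.

```python
import numpy as np

def contacts_per_frame(target_xyz, polypr_xyz, cutoff=1.0):
    """Number of (target residue, polyPR residue) pairs within `cutoff`
    (nm) in each frame. Inputs: (n_frames, n_beads, 3) arrays.
    Assumption: no periodic-image correction is applied."""
    diff = target_xyz[:, :, None, :] - polypr_xyz[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return (dist < cutoff).sum(axis=(1, 2))

def time_average_with_sem(series, n_blocks=3):
    """Mean over all frames and SEM estimated from `n_blocks` block means."""
    series = np.asarray(series, dtype=float)
    block_means = np.array([blk.mean()
                            for blk in np.array_split(series, n_blocks)])
    sem = block_means.std(ddof=1) / np.sqrt(n_blocks)
    return float(series.mean()), float(sem)

def normalized_contacts(ct, n_tc, n_pr):
    """Ct divided by the transport-component and polyPR lengths
    (assumed here to mean division by their product)."""
    return ct / (n_tc * n_pr)

# Toy trajectory: three target residues, one polyPR bead, two frames.
target = np.zeros((2, 3, 3))
target[:, :, 0] = [0.0, 5.0, 10.0]
polypr = np.zeros((2, 1, 3))
polypr[0, 0, 0] = 0.5
polypr[1, 0, 0] = 5.5
counts = contacts_per_frame(target, polypr)

# Toy contact time series to show the block averaging on its own.
ct_mean, ct_sem = time_average_with_sem([2, 2, 4, 4, 6, 6])
print(counts.tolist(), ct_mean, ct_sem, normalized_contacts(ct_mean, 3, 1))
```

With only three blocks the SEM is itself a rough estimate, so it is best read as an indication of the spread between blocks rather than a tight confidence interval.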

Data availability

The data and scripts of this study are available from the corresponding author upon reasonable request.

Author contributions

H.J., E.Vd.G., and P.R.O. designed research; H.J. performed and analyzed research; and H.J., E.Vd.G., and P.R.O. wrote the paper.