Introduction

Since the “resolution revolution” in 2014 (1), single-particle cryo-electron microscopy (cryo-EM) has become widely used, emerging as a powerful alternative to X-ray crystallography and in some cases surpassing it for structurally challenging samples. In particular, the 1.2 Å reconstruction of apoferritin, in which individual hydrogen atoms are resolved, demonstrates that single-particle cryo-EM has now reached true atomic resolution for well-defined systems (2).

Despite recent advances in single-particle cryo-EM, structure determination for sub-50 kDa complexes remains challenging. Among EM structures deposited in the Protein Data Bank (PDB) since 2015 with resolutions better than 4 Å, 97% correspond to macromolecules larger than 50 kDa (3). Small complexes contain fewer atoms and scatter fewer electrons, leading to images with lower signal-to-noise ratios (SNRs). Particle alignment, typically based on calculating the cross-correlation coefficients between the particle and reference (4), depends on the amount of phase contrast (signal) in the image. The weak contrast of small complexes against background noise makes it difficult to align particles accurately enough to calculate a high-resolution reconstruction. Moreover, single-particle experiments typically apply a total exposure of 30–50 electrons per Å², leading to accumulating radiation damage. The diffusion of small particles due to beam-induced motion may further degrade the alignable high-resolution signal (5). Extending single-particle cryo-EM to sub-50 kDa targets will open new avenues for studying critical drug-binding interactions and advance structure-based drug discovery. Approximately 75% of proteins in the human proteome are below 50 kDa (6). While X-ray crystallography is not inherently limited by small molecular weight, it requires well-ordered crystals, which are often difficult to obtain. Solution and solid-state NMR spectroscopy, on the other hand, provide atomic-level information derived from observables such as chemical shifts. However, these methods are generally limited to smaller particles, require highly concentrated samples, and involve laborious data analysis, making them less suitable for high-throughput studies (7–9).
Single-particle cryo-EM offers unique benefits compared to these two techniques, producing rich structural data in the form of images and enabling the visualization of biomolecules in near-native states at high resolution without the need for crystallization. To overcome the lower size barrier in cryo-EM, many strategies focus on increasing the apparent size and rigidity of small particles by binding them to external scaffolds or antibodies (10–13). However, finding good scaffolds or antibodies is difficult, and they sometimes create structural artifacts (14). Thus, there remains an incentive to develop approaches that enable high-resolution structure determination of isolated sub-50 kDa complexes in their near-native conformations.

Theoretical estimates of the lower molecular weight limit, dating back to 1995, suggested that a 3D reconstruction at 3 Å resolution is possible for particles as small as 38 kDa, given that ∼12,600 images are averaged (15). This calculation assumed perfect images, a perfect reference to align the particle images, and a total exposure limited to 5 electrons/Å². Later, using the Rose criterion, a more optimistic prediction of 17 kDa was calculated, requiring only one-ninth as many images to be averaged (16). Since those early calculations, single-particle cryo-EM has seen significant technical advancements. The introduction of direct electron detectors (DEDs), combined with exposure weighting, now allows much higher electron exposures by recording data as movies and improving image resolution through motion correction (17, 18). More recently, the development of a laser phase plate offers the potential to further improve image contrast by using a high-intensity laser beam to introduce a stable and tunable phase shift that enhances low-resolution features in samples with weak contrast, such as small complexes (19–22). Additionally, cooling specimens to liquid-helium temperatures can slow the effects of radiation damage and reduce information loss during imaging (23). Recent work has shown that using a gold specimen support with 100 nm diameter holes at liquid-helium temperatures yielded better image quality than at liquid-nitrogen temperatures (24). 2D template matching (2DTM) was previously developed to identify particles in cellular cryo-EM images with high accuracy using a high-resolution template (25–27). In a 2DTM search, cross-correlation coefficients are calculated between the image and the template to determine the targets’ locations and orientations. Previous work from our lab has shown that molecular features not in the template can be reconstructed from 2DTM-derived targets at high resolution (28).
Building upon this, in this paper, we show that the alignment of sub-50 kDa complexes can be improved using 2DTM and stringent particle selection. We apply this method to a previously published dataset of the ∼41 kDa catalytic domain of protein kinase A (29) and demonstrate improved reconstruction of its ligand-binding sites. We develop a theoretical framework showing that the lower molecular weight limit can be further reduced to approximately 7.1 kDa with an exhaustive five-dimensional search, and to 5.7 kDa with a constrained search, assuming phase plate and liquid-helium cooling are used. Our findings highlight the potential of 2DTM to expand the applicability of cryo-EM to a broader range of biologically and pharmaceutically important targets, paving the way for structural studies of previously inaccessible small complexes.

Results

Unbiased reconstruction of omitted densities in a 41 kDa protein kinase

We evaluated the ability of 2DTM to recover ligand densities without model bias by performing a template matching search and reconstruction with specific features omitted from the template. We used the published dataset of a 41.3 kDa protein kinase (EMPIAR-10252) (29). Particle stacks were generated from single-particle data using coordinates output from 2DTM searches with templates missing certain structural components. A subset of particles was then selected based on 2DTM statistics and image quality measurement from fits of the contrast transfer function (CTF) of the micrographs these particles came from. The selected particle stack was used for 3D reconstruction. This “omit-template” strategy follows the baited reconstruction approach (28) and was designed to detect template bias in reconstructions calculated from targets detected by 2DTM. Here, we use it to show that we can reconstruct the density of small ligands that are not included in the template but are bound to the imaged molecules in the cryo-EM dataset. We explored a range of template deletion scenarios: from omitting ATP and residues on an alpha-helix, to removing the binding pockets around ATP at various radii. Below, we detail how each deletion strategy affected the final density map. Consistently, we found that ATP density was robustly recovered even when ATP was deleted from the 2DTM search template. These results demonstrate that 2DTM can recover ligand densities, providing biological insights into ligand binding and flexibility. The density is free of template bias where corresponding density in the template was omitted (28).

Deleting ATP, inhibitor, and an alpha-helical turn

We used the 2.2 Å X-ray model (PDBID: 1ATP, (30)) to generate a high-resolution 3D template in which the ATP, two Mn2+ ions, and six alpha-helix residues (residues 222-227) were deleted from the model. The final template, with a molecular weight of 40.7 kDa, was simulated using the simulate program in cisTEM with a uniform B-factor of 30 Å² at a pixel size of 1.117 Å/pixel (31). Using the previously developed 2DTM p-value for particle picking and angular alignment (32), we performed 2DTM searches at Nyquist resolution and generated an initial stack of candidate particles, which were then subjected to further selection using 2DTM-derived statistics and CTF fitting parameters. A final stack of 7,353 particles was used for 3D reconstruction in cisTEM. Fig. 1a compares the cryo-EM maps from the single-particle reconstruction (29), the 2DTM template, and the 2DTM-based reconstruction. The angular distribution of the 2DTM-derived particle stack and the Fourier Shell Correlation (FSC) curve are shown in Fig. 1b. Because particles were aligned to a single high-resolution template during 2DTM, the reported 2.6 Å resolution is not a gold-standard FSC and does not reliably reflect true map quality. Fig. 1c provides a close-up view of the ATP-binding site and the deleted residues. The average Q-score between the 2DTM reconstruction and the X-ray model was calculated using the MapQ command-line tool (33, 34). Q-scores approaching 1 suggest atomic resolution, where individual atoms are resolved. Values near 0.5 indicate visible side chains, while scores around 0.2 reflect unresolved side chains but resolved secondary structure (34). The Q-score for ATP was 0.605, indicating good agreement between the 2DTM reconstruction and the X-ray model. Additionally, the shape of the alpha-helix turn is clearly resolved, despite being absent in the template. Since these features were not included in the search template, they did not originate from template bias.

3D reconstruction of protein kinase using 2DTM-derived particle stack.

(a) From left to right: the single-particle reconstruction (EMDB:0409), the 2DTM template, and the 2DTM-derived reconstruction of the protein kinase (29). The X-ray structural model of the protein kinase (PDBID: 1ATP) is also shown where ATP, Mn2+, water molecules, and a segment of alpha-helix (residues 222-227) were deleted from the model. (b) Angular distribution plot and FSC curve calculated using cisTEM. (c) Densities at the ATP-binding site and deleted residues 222–227 visualized by ChimeraX (57). The average Q-scores between the 2DTM reconstruction and the X-ray model were calculated using the MapQ command-line tool (33, 34).

Robust reconstruction of the ATP binding pocket

To test the robustness of 3D reconstruction as more atoms are excluded from the template, we performed a series of 2DTM searches and reconstructions using templates with residues deleted within specified radii of the bound ATP. In Fig. 2, we show the results of three different templates. We tested spherical deletion radii of 3.0 Å (Fig. 2a and b) and 5.5 Å (Fig. 2c) centered on the ATP position in the X-ray model. In addition to the ATP binding pocket, we also deleted Mn2+ in all templates and furthermore deleted IP20, an inhibitory pseudo-peptide substrate, in Fig. 2b. The molecular weights of templates and numbers of particles in the final stacks are shown in the figure.
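The spherical deletion described above can be sketched in a few lines. This is a minimal illustration and not the script used in this work: the function name `residues_within_radius`, the array layout, and the toy coordinates are all hypothetical. The sketch also assumes that a residue is deleted when any of its atoms lies within the cutoff of any ligand atom, which is one plausible reading of "within a radius of ATP".

```python
import numpy as np

def residues_within_radius(atom_xyz, atom_resid, ligand_xyz, radius):
    """Return sorted residue IDs with at least one atom within `radius` (in Å)
    of any ligand atom. Assumption: any-atom-to-any-atom distance criterion."""
    atom_xyz = np.asarray(atom_xyz, dtype=float)
    ligand_xyz = np.asarray(ligand_xyz, dtype=float)
    atom_resid = np.asarray(atom_resid)
    # Pairwise distances, shape (n_protein_atoms, n_ligand_atoms)
    dist = np.linalg.norm(atom_xyz[:, None, :] - ligand_xyz[None, :, :], axis=-1)
    close = (dist <= radius).any(axis=1)
    return sorted(set(atom_resid[close].tolist()))

# Toy coordinates in Å (hypothetical, not taken from PDB 1ATP)
protein_xyz = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [4.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
protein_resid = [53, 53, 127, 200]
ligand_xyz = [[0.0, 1.0, 0.0]]

deleted_3A = residues_within_radius(protein_xyz, protein_resid, ligand_xyz, 3.0)
deleted_55A = residues_within_radius(protein_xyz, protein_resid, ligand_xyz, 5.5)
```

Atoms of the returned residues would then be removed from the model before simulating the template; increasing the radius can only enlarge the deleted set.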

2DTM-based reconstruction of the ATP-binding pocket.

Each row shows, from left to right: (1) the atomic model used to generate the template, with residues within a specified radius from ATP deleted (highlighted in blue); (2) the 2DTM reconstruction, fitted with the template model and colored by the nearest atoms; (3) the recovered ATP density and its average Q-score in the 2DTM reconstruction; and (4) the average Q-scores of all residues. Residues omitted from the templates are circled and colored by chain identity. Q-scores were calculated using the MapQ command-line tool (33, 34). (a) Residues within 3 Å of ATP were deleted. (b) IP20 and residues within 3 Å of ATP were deleted. (c) Residues within 5.5 Å of ATP were deleted.

In all three cases, the reconstructions showed well-defined ATP and Mn2+ densities in the binding pocket. Average Q-scores between the reconstructed maps and the X-ray model were calculated using MapQ (33, 34). The average Q-scores of ATP were 0.607, 0.578, and 0.616, respectively. Although some discontinuity was observed at the chosen contour level (σ = 5) in Fig. 2b, the ATP densities in all reconstructions closely matched the ligand shape in the X-ray model. These results suggest that ATP and Mn2+ densities can be robustly recovered from the data, even when the search template lacks these ligands and nearby residues.

We also found that nearby protein residues were more affected by template deletion than ATP. When residues within 3 Å of ATP were deleted (Fig. 2a), most backbone residues were still recovered with Q-scores above 0.50, except for residues 53 (Ser) and 127 (Glu). However, when the deletion radius increased to 5.5 Å (Fig. 2c), the densities corresponding to those residues began to show visible discontinuities. This was likely caused by the overall reduction of signal in the template that contributes to the cross-correlation signal. In contrast, ATP density was consistently recovered more robustly than nearby residues. This may be because small angular misalignments disproportionately blur peripheral residues, as their displacement scales with distance from the alignment center, whereas the centrally buried ATP density remains relatively unaffected.
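The scaling of displacement with distance can be made explicit with a short worked estimate. For a rotation error Δθ about an axis through the alignment center, an atom at perpendicular distance r from that axis moves along a chord of length δ. The radii and the 1° error below are illustrative assumptions, not values measured in this work.

```latex
\delta = 2 r \sin\!\left(\frac{\Delta\theta}{2}\right) \approx r\,\Delta\theta
\qquad (\Delta\theta \text{ small, in radians})

% Illustrative values (assumed): \Delta\theta = 1^\circ \approx 0.01745\ \mathrm{rad}
r = 5\ \text{\AA}: \quad \delta \approx 0.09\ \text{\AA}
\qquad
r = 30\ \text{\AA}: \quad \delta \approx 0.52\ \text{\AA}
```

Under these assumed values, the same angular error displaces a peripheral atom six times farther than one near the center, which is consistent with density close to the center being blurred less.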

Finally, by comparing Fig. 2a and b, we observed that deleting IP20 strongly reduced signal at several residues. IP20 not only contributes to the overall mass (2.2 kDa) of the template but also sits at the edge of the protein, where it may generate distinct low-resolution features that can facilitate alignment. Altogether, our experiments demonstrate that it is safe to delete a ligand and residues in the nearby regions to avoid template bias without destroying too much signal in the reconstruction.

2DTM provides more accurate alignments than RELION refinement in omitted regions

To evaluate the quality of particles and poses obtained from 2DTM, we imported the stack of 7,353 particles described in Fig. 1 into RELION and performed skip-alignment 3D classification using five classes. As shown in Fig. 3a, Class 5 exhibited noticeably lower resolution than the other classes. We therefore combined Classes 1-4 into a new stack of 7,197 particles and carried out both skip-alignment and regular (alignment-enabled) 3D auto-refinement in RELION. Each resulting map was then post-processed and low-pass filtered for comparison. As shown in Fig. 3b, in both maps, densities for the deleted residues were recovered. Specifically, the densities at ATP and residues 222-227 obtained with the directly imported 2DTM Euler angles were sharper and more continuous than those in the same region produced with RELION angular refinement. These results suggest that the 2DTM-derived orientations are more accurate than those determined by RELION. A more quantitative assessment of the alignment accuracies attained by 2DTM and RELION requires further work.

RELION processing of 2DTM-derived particle stack.

(a) RELION skip-alignment 3D classification of the 2DTM-derived particle stack in Fig. 1. Classes 1-4 were merged for 3D refinement. (b) Both skip-alignment and alignment-enabled auto-refinement were performed on the selected particles from (a). In both maps, densities for the ATP and backbone of the deleted residues were recovered.

Effective selection strategy enables reconstruction of ATP binding site from around 8k particles

An important difference between our 2DTM approach and the traditional single-particle workflow is the way particles were included in the final 3D reconstruction. In the original publication (29), a final stack of 74,413 particles was used to obtain a reconstruction at ∼4.3 Å resolution by gold-standard FSC. The resulting density, however, lacked the features expected at 4 Å and instead appeared more consistent with a ∼6-7 Å map. In particular, the ATP-binding pocket was not resolved in the map (Fig. 1a). This particle stack included both untilted and 30°-tilted data to reduce preferred orientation. The paper noted that multiple optimizations were attempted, including iterative particle selection based on RELION metadata metrics, but none led to significant improvement.

In our analysis, we implemented a different strategy focused on more stringent particle selection. At the image level, CTF fitting scores from untilted images were computed using CTFFIND5 (35, 36), and only images with scores between 0.05 and 0.2, corresponding to well-fit CTFs, were retained for subsequent 2DTM searches (Fig. 4a). Examples of micrographs excluded from 2DTM searches are shown in Fig. 5. At the particle level, we applied two main selection criteria prior to 3D reconstruction:

Image statistics of the untilted micrographs in EMPIAR-10252.

(a) CTF fitting scores for 2,488 untilted images, calculated using ctffind5. Images with scores above 0.2 or below 0.05 were excluded from 2DTM searches. (b) Mean defocus values of the 2,314 images retained for 2DTM. (c) Sample thickness estimates from ctffind5, with a median thickness of 363 Å. (d) Particle counts per thickness bin, based on 17,274 particles extracted from the 2,314 images using extract-particles. (e) Particle counts per thickness bin after particle selection using filter-particles, showing the final stack of 7,353 particles.

Examples of micrographs excluded from 2DTM search.

(a) Very low contrast and ice contamination. (b) Extremely low contrast, likely drift or astigmatism. (c) Particle aggregation or contamination. (d) Crystalline ice or fractured film.

1. 2DTM-derived statistics

Due to the small molecular weight of the target, the 2DTM z-score threshold, calculated from the cross-correlations, led to the rejection of most particles and did not yield meaningful detections. Instead, we used the newly developed 2DTM p-value approach to extract particles during the initial processing step (32). Specifically, instead of using the standard first-quadrant p-values, we calculated a three-quadrant p-value to retain particles with low 2DTM z-scores but high 2DTM SNRs. We found that a p-value threshold of 8.0 consistently gave us the best reconstruction. Following the initial extraction, as described in the Methods section, we applied additional selection steps based on several 2DTM-derived metrics, including 2DTM SNR, and the pixel-level average and standard deviation of cross-correlations from the angular search.

2. CTF-based ice thickness

We excluded images of thicker samples with reduced high-resolution signal, as indicated by their estimated sample thickness from CTF fitting (36). The distributions of mean defocus and thickness of untilted images are shown in Fig. 4b and c. The mean defocus was 9900 Å. Given that the largest dimension of the protein kinase is ∼65 Å, we selected images with estimated thickness between 100 and 800 Å. We found that images with thickness of 300-400 Å contained the most particles based on our criteria (Fig. 4d and e).
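The image-level and thickness cutoffs described above amount to simple range checks. The sketch below is illustrative only: the `Micrograph` record, its field names, and the example values are hypothetical and do not reflect the actual ctffind5 output format or the scripts used in this work. Only the numerical ranges (CTF score 0.05 to 0.2, thickness 100 to 800 Å) are taken from the text, and both are treated as inclusive.

```python
from dataclasses import dataclass

@dataclass
class Micrograph:
    name: str
    ctf_score: float     # CTF fitting score (hypothetical field name)
    thickness_A: float   # estimated sample thickness in Å (hypothetical field name)

def keep_for_search(m, score_range=(0.05, 0.2)):
    """Image-level filter applied before the 2DTM search."""
    lo, hi = score_range
    return lo <= m.ctf_score <= hi

def keep_for_reconstruction(m, thickness_range=(100.0, 800.0)):
    """Thickness filter applied before 3D reconstruction."""
    lo, hi = thickness_range
    return lo <= m.thickness_A <= hi

# Made-up example records
micrographs = [
    Micrograph("img_001", 0.12, 363.0),
    Micrograph("img_002", 0.03, 350.0),  # CTF score too low
    Micrograph("img_003", 0.10, 950.0),  # sample too thick
    Micrograph("img_004", 0.25, 300.0),  # CTF score too high
]

searched = [m for m in micrographs if keep_for_search(m)]
kept = [m.name for m in searched if keep_for_reconstruction(m)]
```

Keeping the two filters as separate functions mirrors the two stages in the text: poorly fit images never enter the search, while the thickness cutoff acts on particles from images that were searched.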

Applying these selection criteria drastically reduced the final stack size. For experiments in Fig. 1 and Fig. 2, ∼8,000 particles were used to generate the reconstruction, only approximately 10% of what was used in the original single-particle pipeline. Despite the order-of-magnitude reduction in data, the resulting 3D map showed a significant improvement at the ATP and IP20 binding sites. However, we note that the global FSC in Fig. 1b is not reliable due to the use of the high-resolution template and the resulting template bias (28). Nevertheless, the recovery of density not included in the template confirms that the alignment of the selected particles was sufficiently accurate to generate clear density for the binding sites.

Our experiments underline the importance of selecting good particles, rather than maximizing the number of particles selected. Previous studies have pointed out that many particles in the final stack are unnecessary and that removing them can improve reconstruction (29, 37). In our experiments, we found that including a larger number of low-quality or misaligned particles, or false positives, may boost the global FSC but blur out weak features such as ligand densities. To test this, we generated a particle stack using a lowered p-value threshold of 7.0. This larger stack (13,669 particles) led to the degradation of signal in the deleted regions, likely due to increased false positives interfering with the true signal (Fig. 6c). Similarly, applying a 2DTM SNR threshold of 7.5 produced a particle stack of comparable size but lower quality, as demonstrated by poor reconstructions in the deleted regions (Fig. 6d). We also observed preferred orientation, as shown by the angular distribution plot of the final stack in Fig. 1b. Since we applied very stringent selection, only 2,551 particles were extracted from the tilted images. However, adding these particles did not improve the density at the deleted regions, as shown in Fig. 6e. Specifically, the Q-score of the ATP density decreased from 0.605 to 0.584 when keeping other parameters constant. Although particles from tilted images provide additional angular views, the images often have thicker ice, reducing the high-resolution information available for accurate alignment.

2DTM reconstructions with varying particle selection parameters.

From left to right, particle stacks were generated using the following thresholds: (a) 2DTM search template where ATP, Mn2+ and residues 222-227 (blue) were deleted; (b) 2DTM p-value = 8.0; (c) 2DTM p-value = 7.0; (d) 2DTM SNR = 7.5; (e) 2DTM p-value = 8.0 with tilted data. All other parameters were kept constant as in Fig. 1. For the tilted dataset, only images with CTF scores between 0.04 and 0.13 and estimated thickness between 100 and 1000 Å were selected. The number of particles in the final stack and the Q-scores of the recovered ATP densities are indicated in the figures. Template bias was calculated using an in-house Python script adapted from (28).

Thus, careful particle curation can enable high-resolution reconstruction of sub-50 kDa complexes with an order of magnitude fewer particles.

Using predicted structures as templates

Experimental structures may be unavailable for the target of interest. We therefore examined whether predicted structures can be used as templates to validate the predictions, or to identify novel structures or interactions. We generated a predicted structure for the protein kinase chain E using AlphaFold3 (38). The atomic model includes IP20, ATP, and Mn2+. A high-resolution template was simulated using the simulate program with the same parameters as above (31). We show the comparison between the X-ray model and the AlphaFold3 model in Fig. 7a. Overall, the structures show good agreement, with an RMSD of 0.45 Å across 336 aligned residue pairs. The differences between the AlphaFold3 model and the X-ray structure of protein kinase A are mainly found in the flexible loop regions (e.g., residues 53–55), the surface-exposed side chains, and IP20. In the experiment in Fig. 2a, residues within 3 Å of ATP were removed from the X-ray-derived template. Here, the same residues were deleted from the AlphaFold3-derived template, resulting in a remaining molecular weight of 40.2 kDa. As shown in Fig. 7b, the density of ATP in the AlphaFold3-derived reconstruction was slightly worse than that obtained using the X-ray-derived template, highlighting the importance of template accuracy in enabling efficient detection and reconstruction of small complexes. We also observed that both reconstructions closely resemble the search templates, including differences in the side chain densities (Fig. 7a). This suggests that potential template bias and false positives remained in the particle stack despite the use of stringent selection criteria. Further improvements could involve systematically deleting residues from the predicted model and assessing the impact on the reconstruction, as demonstrated in (28).
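For readers who wish to reproduce a comparison of this kind, the RMSD between two sets of paired atoms after optimal rigid-body superposition can be computed with the Kabsch algorithm. The sketch below is a generic implementation and an assumption about the procedure: the text does not state which tool or atom selection produced the 0.45 Å value, and the function name `kabsch_rmsd` and the test coordinates are hypothetical.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between paired point sets P and Q (each N x 3) after optimally
    superposing P onto Q with a proper rotation and a translation."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    Pc = P - P.mean(axis=0)          # remove translation
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc                    # 3 x 3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    P_rot = Pc @ R.T                 # apply R to every row of Pc
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))

# Synthetic check: a rotated and shifted copy should superpose exactly
rng = np.random.default_rng(0)
P = rng.normal(size=(20, 3))
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
Q = P @ Rz.T + np.array([1.0, -2.0, 3.0])
rmsd_rigid = kabsch_rmsd(P, Q)
```

In practice, `P` and `Q` would hold matched atoms (for example, the Cα atoms of the aligned residue pairs) from the predicted and experimental models.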

2DTM reconstruction using the AlphaFold model as the search template.

(a) Left: Structural comparison between the X-ray model (PDB ID: 1ATP, gray) and the AlphaFold3 predicted model (blue) (38). Residues within 3 Å of ATP were deleted from the template and are not shown. Right: 2DTM-derived maps using the AlphaFold3 template (blue) and the X-ray template (yellow). (b) Reconstruction at the ATP-binding site using the X-ray model (top) and the AlphaFold3 model (bottom) as the template (contour level σ = 5). (c) Reconstruction at residue 18 (ARG) on IP20 using the X-ray model (top) and the AlphaFold3 model (bottom) as the template (contour level σ = 4).

How small a particle can we study by single-particle cryo-EM

For a particle to be alignable, the cross-correlation between the particle image and the reference needs to be larger than the expected cross-correlation between noise and the reference (16). The lower molecular weight limit of single-particle cryo-EM was estimated as 38 kDa assuming a total exposure (Ne) of 5 e/Å² (15). Later, the estimated limit was lowered to 17 kDa (16). The primary difference between these two predictions lies in the statistical criteria used to assess image visibility: Henderson (15) required that the intensity of the average Fourier component be three times the standard deviation of the shot noise, while Glaeser (16) applied the Rose criterion. More recent work has shown that it is now possible to reconstruct proteins below 50 kDa and even smaller nucleic acids, with predictions that the lower molecular weight limit could be extended below 20 kDa (39). Specifically, a high-resolution 3D reconstruction of the 14 kDa hen egg white lysozyme (HEWL) was obtained from a simulated dataset generated with an ideal phase plate (40). Here, following the rationale of 2DTM, we sought to calculate the lower molecular weight limit for hydrated biological samples for single-particle cryo-EM, taking into account the advancements in instrumentation made over the past decades. During 2DTM, cross-correlation coefficients are calculated between the particle image and 2D projections of the template. Peaks in the cross-correlation map indicate regions of high similarity between the image and the template. A signal-to-noise ratio (SNR) can be defined as the maximum correlation observed when aligning an image against a reference, divided by the standard deviation of the correlations from the background. We consider two scenarios: (1) the image contains pure random noise and the corresponding SNR is SNRn; (2) the image contains real phase contrast from the target and the corresponding SNR is SNRs.
In both scenarios, the SNR is expected to be larger than zero because even in the presence of pure noise, a positive correlation will be obtained after aligning the noise image to a reference. Determining the lower molecular weight limit is equivalent to identifying the intersection at which SNRn and SNRs are equal. When SNRs is smaller than SNRn, it will be impossible to distinguish signal from noise.
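The SNR definition above can be illustrated numerically. The following is a minimal sketch under stated assumptions: the function name `alignment_snr` is hypothetical, the "background" is taken to be every correlation value except the maximum, and the correlation values are synthetic Gaussian numbers rather than values computed from real images.

```python
import numpy as np

def alignment_snr(correlations):
    """Maximum correlation divided by the standard deviation of the remaining
    (background) correlations. Assumption: background = all values but the peak."""
    cc = np.asarray(correlations, dtype=float).ravel()
    peak_idx = int(np.argmax(cc))
    background = np.delete(cc, peak_idx)
    return float(cc[peak_idx] / background.std())

rng = np.random.default_rng(1)

# Scenario 1: correlations from pure noise
noise_cc = rng.normal(0.0, 1.0, size=10_000)
snr_noise = alignment_snr(noise_cc)

# Scenario 2: the same background with one strong peak from a real target
signal_cc = noise_cc.copy()
signal_cc[1234] = 10.0
snr_signal = alignment_snr(signal_cc)
```

Even the noise-only set gives a clearly positive SNR, because the maximum over many samples is selected; a target is distinguishable only when its SNR exceeds that noise-only value.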

For the first case, cross-correlation coefficients between two pure Gaussian noise images can be approximated with a Gaussian distribution with zero mean and a variance of 1/Np, where Np is the number of pixels in the image. In cryo-EM particle alignment, a five-dimensional search is performed for each particle, including two translational parameters (x and y) and three orientational parameters (φ, θ, ψ). For each particle, Ns correlation coefficients are calculated to find the correct alignment. Ns is dependent on the size of the particle and the resolution limit of the alignment. A larger particle and a higher resolution limit require a more finely sampled search space. The parameter set, Θ0 = {x0, y0, φ0, θ0, ψ0}, corresponding to the maximum value among Ns correlations, is then used to register the particle’s alignment. We define SNRn using the maximum and standard deviation of Ns correlation coefficients. Based on Supporting Information A,

This means the more search locations are evaluated, the higher the likelihood of observing a large correlation value purely by chance.
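This growth of the noise maximum with the number of search locations can be checked with a small simulation. The sketch below draws Ns independent Gaussian correlation values with variance 1/Np and compares the simulated ratio of maximum to standard deviation with the textbook leading-order bound sqrt(2 ln Ns). That bound is a standard extreme-value result and is not necessarily the exact expression derived in Supporting Information A. The function names, the value of Np, and the independence of the samples are assumptions made for illustration.

```python
import numpy as np

def simulated_noise_snr(n_search, n_pixels=4096, n_trials=50, seed=0):
    """Monte Carlo estimate of max/std over n_search Gaussian correlations.
    The ratio does not depend on n_pixels, which only sets the overall scale."""
    rng = np.random.default_rng(seed)
    sigma = 1.0 / np.sqrt(n_pixels)
    cc = rng.normal(0.0, sigma, size=(n_trials, n_search))
    return float(np.mean(cc.max(axis=1) / cc.std(axis=1)))

def asymptotic_noise_snr(n_search):
    """Leading-order upper bound on the expected maximum of n_search
    standard normal samples."""
    return float(np.sqrt(2.0 * np.log(n_search)))

search_sizes = [10**2, 10**4, 10**5]
simulated = [simulated_noise_snr(n) for n in search_sizes]
asymptotic = [asymptotic_noise_snr(n) for n in search_sizes]

for n, s, a in zip(search_sizes, simulated, asymptotic):
    print(f"Ns = {n:>6d}  simulated = {s:.2f}  sqrt(2 ln Ns) = {a:.2f}")
```

The simulated values rise with Ns and stay below the bound, so a larger search raises the correlation that a real particle must exceed. Correlations at neighboring search locations are correlated in practice, so the effective number of independent samples may be smaller than Ns.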

For the second case, we follow Henderson’s assumption that the particle is roughly spherical and consists of only randomly positioned carbon atoms (15). Cross-correlations are calculated between an M-frame summed image and 2D projections from the perfect reference. An exposure filter function, Q, is applied to reweight the frames to maximize signal content (18). Similarly, SNRs is defined as the ratio between the maximum cross-correlation and the standard deviation of background correlations. Based on Supporting Information B,

where k is the spatial frequency, N0 is the exposure per frame, D is the particle diameter, Ni is the cumulative exposure at the ith frame, and f is the fraction of electrons being elastically scattered. Assuming a total exposure of 5 e/Å² and a single-frame acquisition (15), we derived a simplified form of Eq. 2 by assuming a realistic CTF with multiple oscillations, such that the integral ∫ CTF(k)² dk ≈ 0.5. This

Under these conditions, we estimate a minimum detectable molecular weight of 38.0 kDa. While this result is numerically similar to Henderson’s original estimate (15), the underlying assumptions differ somewhat. Henderson (15) assumed ideal imaging conditions without CTF oscillations. In contrast, we assume standard cryo-EM conditions with a realistic CTF and compute the signal and noise terms using different models. In particular, our calculation of the number of correlations (Ns) is different. Nevertheless, the agreement in molecular weight limit suggests that this value is a reasonable estimate given the main experimental assumptions (single-frame acquisition, 5 e/Å²).

For movies with multiple frames, Eq. 2 can be numerically integrated over the spatial frequency range [kmin, kmax] to calculate SNRs. Eqs. 1 and 2 are then plotted across a range of molecular weights to identify their intersection point, shown in Fig. 8. The parameters used for these calculations, along with their corresponding values, are listed in Table 1. Using conventional single-particle analysis conditions with a resolution limit of 2 Å, which can be achieved by collecting images using a pixel size of 1 Å/pixel, assuming a perfect beam (i.e., no envelope function), and using a total exposure of 45 e/Å², particles as small as 14.8 kDa can be accurately aligned through a full search. If inelastic scattering from ice with a thickness of 30 nm is considered, particles need to be at least 16.3 kDa to be detected, which is closely consistent with the prediction of 17 kDa in (16). Ideally, if thin ice can be obtained that is just thick enough to embed the particle, then considering the defocus variation across the particle along its diameter and assuming 10% amplitude contrast, the smallest alignable particle is 14.8 kDa. In practice, such thin ice may be difficult to achieve, leading to an increase in the weight limit. If the low-resolution contrast of the particle allows it to be roughly centered in the x,y plane, we may not need to search the entire area covered by the particle. By constraining the translational search to a 5-by-5 pixel region, the molecular weight limit can be reduced to 11.8 kDa under the previously mentioned conditions. Further incorporating a 90° phase plate and using zero defocus lowers the limit to 7.4 kDa under constrained search conditions. Previous work has shown that electron diffraction spots fade more slowly at liquid-helium temperatures by a factor between 1.2 and 1.8, compared to those at liquid-nitrogen temperatures (23).
Assuming an additional cryo-protection factor of 1.8, particles as small as 5.7 kDa are theoretically alignable by 2DTM.
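The intersection-finding step described above can be sketched generically as a bisection over molecular weight. The two curves below are arbitrary placeholders chosen only so that the example runs; they are not Eq. 1 or Eq. 2, and the resulting crossover has no physical meaning. The function name `find_crossover` and all numerical constants are assumptions.

```python
import math

def find_crossover(snr_signal, snr_noise, mw_lo, mw_hi, tol=1e-4):
    """Bisection for the molecular weight at which snr_signal equals snr_noise.
    Assumes the signal curve lies below the noise curve at mw_lo and above it
    at mw_hi, with a single crossing in between."""
    def diff(mw):
        return snr_signal(mw) - snr_noise(mw)
    if diff(mw_lo) > 0 or diff(mw_hi) < 0:
        raise ValueError("bracket does not contain a crossover")
    while mw_hi - mw_lo > tol:
        mid = 0.5 * (mw_lo + mw_hi)
        if diff(mid) < 0:
            mw_lo = mid   # signal still below noise: move the lower bound up
        else:
            mw_hi = mid   # signal at or above noise: move the upper bound down
    return 0.5 * (mw_lo + mw_hi)

# PLACEHOLDER curves (not the paper's equations): a monotonically increasing
# signal term and a constant noise term, both in arbitrary units.
def toy_signal(mw_kda):
    return 2.0 * math.sqrt(mw_kda)

def toy_noise(mw_kda):
    return 10.0

toy_crossover = find_crossover(toy_signal, toy_noise, 1.0, 100.0)
```

With the actual expressions for SNRn and SNRs substituted for the placeholders, the returned value would correspond to the intersection points plotted in Fig. 8; each change of assumptions (ice thickness, constrained search, phase plate, cryo-protection factor) shifts one or both curves and therefore the crossover.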

Single-particle cryo-EM lower molecular weight limits under different assumptions.

A constrained search restricts the x and y dimensions to a 5-by-5 pixel window.

Theoretical lower molecular weight limit.

A constrained search restricts the x and y dimensions to a 5-by-5 pixel window. At the minimal molecular weight, the SNR calculated from alignment noise and phase contrast are equal.

Discussion

The reference quality may be a limiting factor of single-particle reconstruction

Images collected in (29) are of high quality, and particles can be observed visually in many of them. Our CTF analysis shows that 2,108 images in the untilted dataset exhibited Thon rings extending beyond 4 Å. However, the traditional single-particle processing done in (29) was limited to 4.3 Å resolution, and features at the ATP-binding site were not resolved. In the same work, the authors analyzed two other sub-100 kDa complexes, the 82 kDa homodimeric enzyme alcohol dehydrogenase (ADH) and the 64 kDa methemoglobin (metHb), and found that very few particles from many images, rather than many particles from very few images, contributed to the final stack used for reconstruction. This is consistent with our results of selecting a subset of particles for the protein kinase dataset, where most images yielded fewer than 10 particles (Fig. 4d and e). So, what is the limiting factor of the single-particle pipeline? We speculate that the images themselves contain sufficient signal to support a reconstruction beyond 4.3 Å resolution. A possible contributing factor is the suboptimal reference used during 3D refinement. In the workflow in (29), an ab initio volume generated by cryoSPARC was low-pass filtered to 20 Å and then used as the initial reference for 3D auto-refinement in RELION. Because this starting map lacked high-resolution features, particle alignment may have been less accurate. Once these misalignments were introduced, subsequent refinement iterations may not have been able to escape the resulting local optimum and fully correct them. By contrast, the SAM-IV riboswitch, despite having a comparable molecular weight, has distinct asymmetric features even at low resolution. Combined with the stronger scattering of nucleic acids, this likely facilitated more accurate alignment, allowing the traditional single-particle workflow to achieve a high-resolution reconstruction (41).

Remaining gaps between experimental and theoretical limits

Despite the progress in cryo-EM, there is still a gap between our predicted lower molecular weight limit in Table 1 and the smallest template used in our tests (37.3 kDa). Several factors may contribute to this difference. The first is beam-induced motion, which particularly affects small particles (42). Uncorrected motion can blur high-resolution features and lead to missed detections or misalignments. Although Bayesian polishing can partially compensate for beam-induced motion, it depends on sufficient particle signal to model individual trajectories. For very small particles, the per-particle signal may be too weak to reliably support this analysis (43). The protein kinase dataset was collected using UltrAuFoil grids (29), which have been shown to reduce specimen motion (44). Nevertheless, beam-induced motion remains a factor limiting the signal in cryo-EM images.

Another factor is the electron beam energy used for imaging. The images were collected at 200 keV, whereas the theoretical calculation presented here assumes 300 keV. Both elastic and inelastic scattering cross-sections increase at lower voltage. A recent study of sub-200 kDa complexes showed that there is no obvious choice of electron energy for imaging smaller complexes (45). Any gain in image contrast from stronger elastic scattering may be offset by the increase in inelastic scattering (45). In addition, lens aberrations are more noticeable at lower energies (46), and detectors are generally optimized for 300 keV electrons. Our calculation assumed a perfect beam, whereas the experimental images suffer from spherical and chromatic aberrations. New developments in aberration correction will help obtain atomic-resolution structures of small complexes (47-50).

Non-ideal experimental conditions will also widen the gap between theoretical and experimental limits. For example, we assumed the ice is just thick enough to accommodate the particle, whereas the images in the protein kinase dataset have varying ice thickness, as shown in Fig. 4c. Additionally, we did not perform a defocus search during 2DTM; instead, we used the average defocus values estimated by CTFFind. Incorporating the defocus search directly into 2DTM may give more accurate estimates in certain cases.

Another contributing factor is the template generation strategy. In Fig. 9, the difference map between the template and the reconstruction for the experiment in Fig. 1a reveals extra densities in the reconstruction at the ATP binding site, as well as at residues 222-227, which were deleted from the template. However, we also observed noisy densities in other regions of the difference map, indicating a limitation of the current template generation approach, which may not fully capture the solvent background.

Difference map between the template and the reconstruction shown in Fig. 1.

(a) The difference map (pink) was generated using the diffmap program (58). Contour levels are set to 5 (left) and 8 (right). (b-c) At a contour level of 8, difference densities are observed at the ATP binding site and residues 222-227.

Considering all these factors, it is clear why single-particle alignment of proteins much below ∼40 kDa remains an outstanding challenge despite the remarkable algorithmic improvements of recent years. Advances in microscope optics, detector performance, image processing algorithms, and sample preparation strategies will be essential to close this experimental–theoretical gap and make high-resolution cryo-EM of small complexes a reality.

Implications for structure-based drug design

For structure-based drug design (SBDD), obtaining a high-resolution structure of the target (and its ligand complex) is a critical step. X-ray crystallography has long been the dominant method for this purpose, but many protein targets are hard to crystallize, especially those that are flexible or membrane-embedded (51). NMR can handle very small proteins or nucleic acids in solution, but typically offers lower-resolution information and is limited to proteins below 25 kDa; above this size, spectra become too complex to interpret without isotope labeling and extra processing (7, 8). Cryo-EM is emerging as a powerful alternative and complementary method that enables structural and functional studies under conditions more closely resembling the native cellular context. One major limitation of cryo-EM, however, is the reconstruction of sub-50 kDa complexes.

The 2DTM-based single-particle alignment and reconstruction workflow we propose here simplifies the conventional single-particle pipeline by forgoing iterative rounds of 2D classification, ab initio modeling, 3D classification and refinement. 2DTM directly returns particles with their x, y position, Euler angles and defocus in one pass. Aside from the computational cost of the search itself, the workflow is simple compared to a conventional single-particle pipeline.

Another motivation for 2DTM is overcoming the size limit of cryo-EM. Conventional single-particle workflows have struggled to achieve good reconstructions for particles below 50 kDa, where low contrast makes alignment unreliable with an imperfect reference. To date, only a handful of isolated sub-100 kDa structures have been solved at < 4 Å resolution. Notable examples include the 52 kDa streptavidin tetramer resolved to 3.2 Å using a Volta phase plate and Cs corrector (52), and to 2.6 Å using graphene grids (53); the 64 kDa methemoglobin at ∼2.8 Å resolution (29); and the 39 kDa SAM-IV riboswitch at 3.7 Å resolution (41). The 2DTM-based reconstruction method we describe here improves the alignment of small particles, utilizing the high-resolution information from the perfect reference. Our theoretical estimation of the lower molecular weight limit further highlights that for perfect images and a perfect reference, complexes as small as 5.7 kDa can be accurately aligned when using liquid helium cooling and a phase plate.

The workflow we describe here also has the potential to operate as a screening platform, enabling structure-guided optimization of small-molecule ligands even for targets below the traditional cryo-EM size limit. High-resolution cryo-EM is already being applied to drug targets that resist crystallization: for example, the high-resolution (up to 1.8 Å) maps of human CDK-activating kinase bound to 15 different inhibitors revealed detailed inhibitor interactions and water networks in the active sites (54). In another case, cryo-EM captured a novel allosteric mechanism for protein inhibition of the human ATP-citrate lyase that enhances the target’s “druggability” (55). These studies show that resolving the ligand-bound structures can directly guide the design of novel therapeutics. Our workflow can broaden this approach to smaller drug targets. In principle, one could incubate a sub-50 kDa target with various inhibitors, then apply 2DTM using the apo structure as the template, which can be determined from in vitro experiments or AlphaFold predictions. The ligand-bound complex can be located, aligned and reconstructed using 2DTM. 2DTM thus offers a structure-based assay: binding of each inhibitor produces a distinct density feature at the binding site, streamlining hit validation.

In summary, by overcoming the ∼50 kDa barrier, 2DTM opens the door to structural studies of many previously inaccessible drug targets, with the ultimate goal of integrating cryo-EM into high-throughput SBDD.

Toward data-driven refinement of AlphaFold3 models via 2DTM

Our results demonstrate that AlphaFold3-predicted structures can be used directly as templates in 2DTM searches to pick particles and reconstruct high-resolution maps, even in the absence of any experimentally determined model. This validates the use of AlphaFold3 as a starting point for structure determination. While the predicted structure may deviate from the true conformation, as shown in Fig. 7a, the resulting cryo-EM map provides information that can guide further refinement of the AlphaFold3 model. This refined model could then be used as an improved template in a second round of particle picking and reconstruction, allowing recovery of more accurate particles and a higher-quality map. In Fig. 7b, we show a deleted residue and the reconstructed densities obtained using both the X-ray template and the AlphaFold3 template. Although the densities are weak and discontinuous, both maps are consistent with the side chain conformation observed in the X-ray model. While we have not yet tested this type of refinement, these preliminary results suggest that this data-driven approach could progressively improve both the structural model and the final reconstruction, starting entirely from prediction.

Methods

Cryo-EM data set and image processing

Unaligned movies of the protein kinase were downloaded from EMPIAR-10252 (29). The dataset contains 4,809 images, among which 2,488 are from untilted samples and 2,321 were collected at 30° tilt. Motion correction and exposure weighting were performed using the MotionCor2 program (56). We kept the same procedure as the original publication by using 5 × 5 tiled frames with a B-factor of 250 Å2 and a binning factor of 2. We used the exposure-weighted summed frames for CTF fitting using CTFFind5 (36) with a box size of 512 and a resolution range of 4-30 Å. We selected images with CTF fitting scores between 0.05 and 0.2 for the untilted dataset and between 0.04 and 0.13 for the tilted dataset. Images that were excluded are shown in Fig. 5 for comparison.

2DTM

After selecting good micrographs based on their CTF fit, 2,314 images of untilted and 2,252 of tilted samples with a pixel size of 1.117 Å/pixel were used for 2DTM. For the experiment in Fig. 1, ATP, Mn2+, and residues 222-227 were deleted from the X-ray model (PDB ID: 1ATP) before template simulation. For the experiment in Fig. 2, residues within a radius of 3 Å and 5.5 Å from ATP were deleted from the X-ray model. IP20 was also deleted for the test in Fig. 2b. Modified atomic coordinates were generated using UCSF ChimeraX (57). High-resolution 3D templates were then generated from the modified models using the simulate program in cisTEM (31). A uniform B-factor of 30 Å2 was applied to all atoms. For all tests, 2DTM searches were done using an angular search step of 2.5° for out-of-plane angles and 1.5° for the in-plane angle, with no defocus search.

Particle extraction and selection

To streamline post-processing of 2DTM, we implemented a dedicated Python toolkit 2DTM_postprocess_tool. The module contains two command-line functions:

1. extract-particles

Input arguments are (i) the cisTEM 2DTM SQLite project database, (ii) the IDs of the search and associated CTF-estimation jobs, and (iii) the image pixel size. The program extracts candidate particles from each image in the specified job using one of the three 2DTM metrics: SNR, z-score, or p-value. For the p-value, the user may choose to calculate the one-quadrant or three-quadrant p-value. In the protein kinase example, we first located local maxima in 2DTM SNR maps (exclusion radius = 10 pixels; micrograph border mask = 92 pixels to avoid truncated particles), then calculated the three-quadrant p-values and stored particles with p-values greater than or equal to 8.0.
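The local-maximum step described above can be sketched as a greedy peak search. This is only an illustration under stated assumptions, not the code in 2DTM_postprocess_tool: the function name pick_peaks, its arguments, and the threshold used in the usage note are hypothetical, and the p-value calculation is not reproduced.

```python
import numpy as np

def pick_peaks(snr_map, threshold, exclusion_radius=10, border=92):
    """Greedy local-maximum picking on a 2D score map (illustrative sketch).

    Returns a list of (x, y, score) tuples, highest score first.
    All names and defaults here are hypothetical, chosen to mirror the text.
    """
    work = np.array(snr_map, dtype=float)  # work on a copy
    ny, nx = work.shape

    # Mask the micrograph border so truncated particles are never picked.
    valid = np.zeros(work.shape, dtype=bool)
    valid[border:ny - border, border:nx - border] = True
    work[~valid] = -np.inf

    yy, xx = np.ogrid[:ny, :nx]
    peaks = []
    while True:
        iy, ix = np.unravel_index(np.argmax(work), work.shape)
        value = work[iy, ix]
        if not np.isfinite(value) or value < threshold:
            break
        peaks.append((int(ix), int(iy), float(value)))
        # Exclude every pixel within the exclusion radius of the accepted peak.
        work[(yy - iy) ** 2 + (xx - ix) ** 2 <= exclusion_radius ** 2] = -np.inf
    return peaks
```

A call such as pick_peaks(snr_map, threshold=6.0) would return candidate positions that could then be passed to a separate significance test.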

2. filter-particles

This function provides secondary quality selection metrics based on the CTF fitting quality, sample thickness, and particle-level statistics from the 2DTM angular search (mean and standard deviation of per-pixel cross-correlations across sampled orientations). For the present study we required an SNR > 6.0, an ice thickness between 100 and 800 Å, and a CTF fitting score between 0.05 and 0.2 for untilted images (or between 0.04 and 0.13 for tilted images). Particles whose angular search mean cross-correlation was negative, or whose standard deviation of cross-correlations exceeded 1.1, were also discarded.
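The selection criteria listed above can be written as a single predicate. The following is a minimal sketch, assuming each particle carries the listed statistics; the Particle record and the keep_particle function are hypothetical names and do not reflect the actual interface of filter-particles.

```python
from dataclasses import dataclass

@dataclass
class Particle:
    # Hypothetical per-particle record; field names are illustrative only.
    snr: float            # 2DTM SNR of the detection
    ice_thickness: float  # estimated sample thickness in Å
    ctf_score: float      # CTF fitting score of the parent micrograph
    cc_mean: float        # mean cross-correlation over the angular search
    cc_std: float         # SD of cross-correlations over the angular search
    tilted: bool = False  # True if the micrograph was collected at 30° tilt

def keep_particle(p, min_snr=6.0, ice_range=(100.0, 800.0), max_cc_std=1.1):
    """Return True if the particle passes the thresholds quoted in the text."""
    ctf_lo, ctf_hi = (0.04, 0.13) if p.tilted else (0.05, 0.2)
    return (
        p.snr > min_snr
        and ice_range[0] <= p.ice_thickness <= ice_range[1]
        and ctf_lo <= p.ctf_score <= ctf_hi
        and p.cc_mean >= 0.0          # negative mean correlation is discarded
        and p.cc_std <= max_cc_std    # SD above 1.1 is discarded
    )
```

Keeping the thresholds as keyword arguments makes it straightforward to apply different cut-offs to other datasets.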

Both steps output a star file that can be used for the following 3D reconstruction.

3D reconstruction

Particle stacks processed by filter-particles and alignment parameters were imported into cisTEM as a refinement package for single-particle processing. A 3D reconstruction was generated using the cisTEM program reconstruct3d. UCSF ChimeraX was used for visualizing the final reconstructions.

RELION processing

Particles selected by 2DTM were subjected to 3D classification in RELION using a tau fudge factor of 4 and an E-step resolution limit of 7 Å, resulting in five classes. The best-resolved classes were subsequently 3D auto-refined (with or without alignment) using C1 symmetry and a 3.7° angular sampling step, with refinement performed against the corresponding 10 Å low-pass filtered template. A soft mask was then generated from the auto-refined map by applying a 15 Å low-pass filter and using a soft-edge of 6 pixels. The final map was produced by post-processing with B-factor sharpening and low-pass filtering.

Supplementary Note 1: Estimating the lower molecular weight limit

A. SNR from alignment noise

For two independent Gaussian noise images of Np pixels, the normalized cross-correlation, rn, is zero on average. When Np is large, the distribution of rn can be approximated by a Gaussian distribution:

$$ r_n \sim \mathcal{N}\left(0, \frac{1}{N_p}\right) $$

In cryo-EM particle alignment, normalized cross-correlations are computed between a noisy particle image and a set of clean 2D projections generated at sampled orientations, in order to determine the best-matching position (x, y) and orientation. For each particle image, Ns cross-correlations are evaluated, and the alignment is assigned based on the position and orientation corresponding to the maximum cross-correlation. This process is equivalent to drawing the maximal value from Ns Gaussian-distributed random variables with zero mean and variance of $1/N_p$. The upper bound of the expectation of this maximum is (59)

$$ \mathbb{E}\left[\max_{1 \le j \le N_s} r_{n,j}\right] \le \sqrt{\frac{2 \ln N_s}{N_p}} $$

We define the signal-to-noise ratio (SNR) of alignment noise as the number of standard deviations (SDs) by which the maximum cross-correlation exceeds the SD of cross-correlations computed across a pure noise image:
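As a numerical illustration of this definition, the sketch below assumes that the alignment-noise SNR reduces to the standard bound on the expected maximum of Ns independent unit-variance Gaussian variables, sqrt(2 ln Ns), and compares it with a small Monte Carlo simulation. The function names and the Ns values are hypothetical and are not taken from Table 1.

```python
import math
import numpy as np

def alignment_noise_snr(n_search):
    """Upper bound, in units of the noise SD, on the expected maximum of
    n_search independent Gaussian cross-correlations (assumed form)."""
    return math.sqrt(2.0 * math.log(n_search))

def simulated_max_snr(n_search, n_trials=200, seed=0):
    """Monte Carlo estimate of the expected maximum of n_search
    standard-normal samples, averaged over n_trials repetitions."""
    rng = np.random.default_rng(seed)
    maxima = rng.standard_normal((n_trials, n_search)).max(axis=1)
    return float(maxima.mean())

if __name__ == "__main__":
    # Hypothetical search sizes: a smaller search lowers the noise threshold.
    for n in (10**4, 10**6, 10**8):
        print(n, round(alignment_noise_snr(n), 3))
```

Because the bound grows only with the square root of the logarithm of Ns, restricting the translational search lowers the threshold modestly, which is consistent with the reduction in the molecular weight limit reported for the constrained search.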

Assume that the high-resolution limit for alignment is d = 1/kmax. The ideal pixel size is then p = d/2. A five-dimensional search includes the following components:

  1. in-plane rotations: .

  2. Out-of-plane viewing directions:

  3. x, y shifts: .

The total number of correlations calculated during a five-dimensional search is then

B. SNR from phase contrast

Fraction of electrons being elastically scattered up to a resolution limit

When the image contains phase contrast, the SNR is defined as the number of SDs by which the normalized cross-correlation exceeds the background SD, consistent with the definition used in the 2DTM implementation (25).

We assume that the particle consists of only randomly positioned carbon atoms. The electron atomic scattering factor of carbon can be approximated as a sum of (normally) five Gaussians (60, 61):

where ai and bi are the fitting parameters up to 6 Å⁻¹ (62), and k is the spatial frequency (Å⁻¹). The differential scattering cross-section is:

where θ is the scattering angle and

where λ is the electron wavelength. The differential cross-section for a single, isolated atom is related to θ by:

We now integrate to calculate the total scattering cross-section:

Assuming protein density ρ ≈ 0.8 Da/Å3, the number of carbon atoms equivalent to a spherical protein of diameter D (Å) is (15):
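The relation between particle diameter, molecular weight and the number of carbon-equivalent atoms can be sketched as follows. This is a minimal example assuming a solid sphere of uniform density and a mass of 12 Da per carbon atom; whether the expression in (15) uses exactly these conventions is an assumption, and the function names are hypothetical.

```python
import math

PROTEIN_DENSITY = 0.8  # Da/Å^3, the value assumed in the text
CARBON_MASS = 12.0     # Da per carbon atom (assumption for the equivalence)

def sphere_mw_da(diameter_angstrom, density=PROTEIN_DENSITY):
    """Molecular weight (Da) of a uniform sphere of the given diameter (Å)."""
    return density * math.pi * diameter_angstrom ** 3 / 6.0

def sphere_diameter_angstrom(mw_da, density=PROTEIN_DENSITY):
    """Diameter (Å) of a uniform sphere with the given molecular weight (Da)."""
    return (6.0 * mw_da / (math.pi * density)) ** (1.0 / 3.0)

def carbon_equivalents(diameter_angstrom, density=PROTEIN_DENSITY):
    """Number of carbon atoms with the same total mass as the sphere."""
    return sphere_mw_da(diameter_angstrom, density) / CARBON_MASS

if __name__ == "__main__":
    d = sphere_diameter_angstrom(14_800.0)  # e.g. a 14.8 kDa particle
    print(f"diameter = {d:.1f} Å, carbon equivalents = {carbon_equivalents(d):.0f}")
```

Since the diameter scales with the cube root of the molecular weight, halving the mass reduces the diameter by only about 20%.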

The fraction of electrons being elastically scattered by the particle is:

The exit wave function at distance z below the specimen is (63)

where o(x) is the projected potential of the molecule. We can now relate the molecular weight of the particle to the Fourier component of the image:

where F (k) denotes the 2D Fourier transform of the projected Coulomb potential o(x) of the particle.

Image formation model

The wave function in Fourier space after lens aberration is (5, 63):

in which the lens aberration function (64) is

Here, Δf is the defocus (positive for underfocus) and Cs is the spherical aberration coefficient. The contrast transfer function (CTF) is defined as

The Fourier transform of the linearized intensity is (5, 63):

Eq. S16 is the linear model of image formation in cryo-EM.

Given the per-frame per-unit area exposure where N0 is the exposure per-frame, the observed noisy image of a single frame is

where

Equivalently, in Fourier space

where

and the noise term ni(x) is additive white Gaussian noise:

Where . We will ignore the contribution of the DC term in the cross-correlation.

Matching with a perfect reference

We now define an SNR value as the expected value of cross-correlations generated from phase contrast, divided by the SD of the correlations from noise

For an image summed over M frames, Eq. S19 is updated with an exposure filter Q(k, Ni) calculated from (18)

where Ni = i · N0 is the ith frame cumulative exposure, and

where

It follows that

ensuring the noise in the summed frame remains “white” with variance .
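The exposure weighting described above can be sketched numerically. The example assumes that (18) refers to the widely used critical-exposure parameterization Nc(k) = a·k^b + c with a = 0.245, b = −1.665 and c = 2.81 (e/Å2, 300 keV) and a filter of the form exp(−N / (2 Nc(k))); these constants and all function names are assumptions made for illustration.

```python
import numpy as np

def critical_exposure(k, a=0.245, b=-1.665, c=2.81):
    """Critical exposure (e/Å^2) at spatial frequency k (1/Å, k > 0).
    Constants are assumed values for 300 keV, not taken from this paper."""
    return a * np.power(k, b) + c

def exposure_filter(k, cumulative_exposure):
    """Exposure filter Q(k, N) for a frame with cumulative exposure N."""
    return np.exp(-cumulative_exposure / (2.0 * critical_exposure(k)))

def summed_frame_weights(k, n_frames, exposure_per_frame):
    """Per-frame weights Q_i / sqrt(sum_j Q_j^2) at each frequency.

    The normalization keeps the noise power of the weighted sum equal to
    that of a single frame, so white noise stays white.
    Returns an array of shape (n_frames, len(k)).
    """
    k = np.asarray(k, dtype=float)
    cumulative = exposure_per_frame * np.arange(1, n_frames + 1)  # N_i = i * N_0
    q = exposure_filter(k[None, :], cumulative[:, None])
    return q / np.sqrt((q ** 2).sum(axis=0, keepdims=True))
```

For example, summed_frame_weights(np.array([0.1, 0.25, 0.5]), 30, 1.5) returns weights whose squares sum to one at every frequency, with early frames weighted more strongly at high resolution.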

The summed image:

where

Using a perfect reference that is also exposure filtered

the expected CC from the signal is

For noise, the mean CC is zero and the variance is:

Thus, the SNR from phase contrast, assuming Wilson statistics (65) (flat spectrum of randomly positioned carbon atoms), is:

where the integral T can be calculated numerically.

C. Advanced assumptions

Inelastic scattering

We further add a correction for inelastic scattering using:

where t is the sample thickness and Λin is the inelastic scattering mean free path of the solution. The inelastic mean free path depends on the incident electron energy and is related to the scattering cross-section as:

where σin,sol is the weighted inelastic scattering cross-section calculated by individual atom’s inelastic scattering cross-sections (66, 67):

where Z is atomic number, β the ratio between velocity of the electron and light, U0 the incident electron energy and mc2 the rest energy of the electron.

For vitreous ice,

For example, at 300 keV, Λin = 343 nm.
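To give a sense of scale, the sketch below evaluates an exponential attenuation exp(−t/Λin), taking the 343 nm value quoted above as the inelastic mean free path at 300 keV. Whether the correction is applied to the signal amplitude or to the intensity in the full calculation is not reproduced here, and the function name is hypothetical.

```python
import math

def inelastic_transmission(thickness_nm, mean_free_path_nm=343.0):
    """Fraction of electrons that pass through a layer of the given
    thickness without inelastic scattering, exp(-t / Lambda_in)."""
    return math.exp(-thickness_nm / mean_free_path_nm)

if __name__ == "__main__":
    for t in (10.0, 30.0, 80.0):  # example ice thicknesses in nm
        print(f"t = {t:4.0f} nm -> transmitted fraction = {inelastic_transmission(t):.3f}")
```

For the 30 nm ice thickness considered in the main text, roughly 92% of electrons remain unscattered inelastically under this assumption.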

Phase plate

Based on (35), Eq. S15 can be extended to

where

Here, Δφ is an additional phase shift introduced by a phase plate and w is the fraction of amplitude contrast (e.g., 0.07 or 0.1).
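The effect of the phase plate and amplitude contrast on the CTF can be illustrated with a short sketch. It assumes the convention commonly used in CTF fitting software, CTF = −sin(χ) with χ = πλk²(Δf − ½Csλ²k²) + Δφ + arctan(w/√(1 − w²)); the overall sign and the exact form used in the supplementary equations may differ, and the function names and default values are hypothetical.

```python
import numpy as np

def electron_wavelength_A(voltage_kv):
    """Relativistic electron wavelength in Å for an accelerating voltage in kV."""
    v = voltage_kv * 1e3
    return 12.2643 / np.sqrt(v + 0.97845e-6 * v ** 2)

def ctf(k, defocus_A, cs_mm=2.7, voltage_kv=300.0,
        amp_contrast=0.07, phase_shift=0.0):
    """CTF at spatial frequency k (1/Å) under the assumed convention.

    defocus_A   : defocus in Å, positive for underfocus
    amp_contrast: fraction of amplitude contrast, w
    phase_shift : additional phase-plate shift in radians
    """
    lam = electron_wavelength_A(voltage_kv)
    cs_A = cs_mm * 1e7  # mm -> Å
    k2 = np.asarray(k, dtype=float) ** 2
    chi = np.pi * lam * k2 * (defocus_A - 0.5 * cs_A * lam ** 2 * k2)
    chi = chi + phase_shift + np.arctan2(amp_contrast, np.sqrt(1.0 - amp_contrast ** 2))
    return -np.sin(chi)
```

With zero defocus and no phase plate, the low-frequency CTF is close to −w, so little low-resolution contrast is transferred; a 90° phase shift (phase_shift=np.pi/2) brings the magnitude close to 1 in focus, which is the situation considered for the phase-plate limit.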

Defocus spread from particle thickness

The defocus varies across the particle “depth” D. We account for this in the CTF based on (5, 36). For each depth slice z, the updated phase shift (Eq. S38) is

Hence, the depth-averaged CTF is
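The depth averaging can be checked numerically. The sketch below assumes uniform weighting of slices across the depth D, neglects amplitude contrast, and uses CTF = −sin(χ); under these assumptions the average equals the CTF at the central defocus multiplied by an envelope sin(x)/x with x = πλk²D/2. The function names and default parameter values are hypothetical.

```python
import numpy as np

def depth_averaged_ctf(k, defocus_A, depth_A, wavelength_A=0.0197,
                       cs_A=2.7e7, n_slices=201):
    """Average -sin(chi) over n_slices defocus values spanning the depth."""
    k2 = np.asarray(k, dtype=float) ** 2
    # Slice offsets at the midpoints of n_slices equal sub-intervals of [-D/2, D/2].
    z = ((np.arange(n_slices) + 0.5) / n_slices - 0.5) * depth_A
    chi = np.pi * wavelength_A * k2[..., None] * (
        defocus_A + z - 0.5 * cs_A * wavelength_A ** 2 * k2[..., None]
    )
    return -np.sin(chi).mean(axis=-1)

def depth_envelope(k, depth_A, wavelength_A=0.0197):
    """Analytical damping sin(x)/x with x = pi * lambda * k^2 * D / 2."""
    x = 0.5 * np.pi * wavelength_A * np.asarray(k, dtype=float) ** 2 * depth_A
    return np.sinc(x / np.pi)  # np.sinc(t) is sin(pi t) / (pi t)
```

For a small particle the envelope stays close to 1 up to high resolution, so the defocus spread across the particle has only a minor effect on the predicted limit.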

Liquid helium cooling

Because electron diffraction spots fade 1.2-1.8× more slowly at liquid-helium temperature than at liquid-nitrogen temperature (23), the exposure filter function can be updated to
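One way to express this update is to scale the critical exposure by a cryo-protection factor. The sketch below assumes that form, together with an exposure filter exp(−N / (2 Nc(k))) and assumed constants for Nc(k) = a·k^b + c; the function name, the constants and the scaling itself are illustrative assumptions rather than the expression used in the calculation.

```python
import numpy as np

def exposure_filter_with_cryoprotection(k, exposure, cryo_factor=1.0,
                                        a=0.245, b=-1.665, c=2.81):
    """Exposure filter with the critical exposure scaled by cryo_factor.

    k          : spatial frequency in 1/Å (k > 0)
    exposure   : cumulative exposure in e/Å^2
    cryo_factor: 1.0 for liquid nitrogen; 1.2 to 1.8 for liquid helium (23)
    """
    critical = cryo_factor * (a * np.power(k, b) + c)
    return np.exp(-exposure / (2.0 * critical))
```

Under this assumption, a factor of 1.8 is equivalent to reducing the effective exposure by the same factor, so that 81 e/Å2 at liquid-helium temperature is attenuated as much as 45 e/Å2 at liquid-nitrogen temperature.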

Data availability

The current manuscript is a computational study, so no data have been generated for this manuscript. The modeling code is available in the GitHub repository referenced in the manuscript.

Acknowledgements

We thank the members of the Grigorieff lab for the fruitful discussion of this work. We are especially grateful to Dongjie Zhu for sharing and testing his new methods and for many insightful conversations.

Additional information

Code availability

The Python package is available at https://github.com/kekexinz/2DTM_postprocess_tool, and it can also be accessed in the 2DTM_postprocess_tool branch of the official cisTEM repository (https://github.com/timothygrant80/cisTEM).

Funding

Howard Hughes Medical Institute (HHMI)

  • Kexin Zhang

Howard Hughes Medical Institute (HHMI)

  • Nikolaus Grigorieff