Enabling X-ray free electron laser crystallography for challenging biological systems from a limited number of crystals

  1. Monarin Uervirojnangkoorn
  2. Oliver B Zeldin
  3. Artem Y Lyubimov
  4. Johan Hattne
  5. Aaron S Brewster
  6. Nicholas K Sauter
  7. Axel T Brunger (corresponding author)
  8. William I Weis (corresponding author)

Affiliations

  1. Stanford University, United States
  2. Janelia Research Campus, United States
  3. Lawrence Berkeley National Laboratory, United States
  4. Howard Hughes Medical Institute, Stanford University, United States

14 figures and 4 tables

Figures

Geometry of the diffraction experiment and calculation of the Ewald-offset distance, rh.

(A) A reciprocal lattice point intersects the Ewald sphere. The inset shows the coordinate system used in cctbx.xfel and prime. The vector S0 represents the direction of the incident beam (–z-axis) …

https://doi.org/10.7554/eLife.05421.003
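
The geometry in this caption can be illustrated with a minimal sketch. It assumes the usual geometric definition of the Ewald-offset distance, rh = |x + S0| − 1/λ, where x is the reciprocal lattice point and S0 has length 1/λ and points along the –z-axis as stated above. The function name `ewald_offset` and the sign convention (negative inside the sphere, positive outside) are illustrative assumptions, not the cctbx.xfel or prime API.

```python
import numpy as np

def ewald_offset(x, wavelength):
    """Signed distance r_h (in 1/Angstrom) from a reciprocal lattice point
    to the surface of the Ewald sphere.

    Assumption: the incident beam vector S0 points along -z and has
    length 1/wavelength, so the sphere is centred at -S0.
    """
    s0 = np.array([0.0, 0.0, -1.0 / wavelength])
    s = np.asarray(x, dtype=float) + s0          # vector from sphere centre to x
    return float(np.linalg.norm(s) - 1.0 / wavelength)

if __name__ == "__main__":
    # The reciprocal-space origin always lies on the sphere, so r_h is zero.
    print(ewald_offset([0.0, 0.0, 0.0], wavelength=1.0))
    # A point slightly off the sphere gives a small non-zero offset.
    print(ewald_offset([0.5, 0.0, 0.1], wavelength=1.0))
```

A reflection is in exact diffracting condition when this value is zero; the magnitude of a non-zero value is what a partiality model would take as input.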
Post-refinement protocol.

The flowchart illustrates the iterative post-refinement protocol, broken up into ‘microcycles’ that refine groups of parameters iteratively (blue boxes), and ‘macrocycles’. At the beginning of first …

https://doi.org/10.7554/eLife.05421.004
Post-refinement during the first macrocycle of post-refinement for myoglobin.

Shown are the values of the refined parameters and target functions during the first macrocycle of post-refinement for a representative diffraction image of the myoglobin XFEL diffraction data set. …

https://doi.org/10.7554/eLife.05421.009
Convergence of post-refinement after five macrocycles for myoglobin.

The plots illustrate the convergence of post-refined parameters, target functions, and quality indicators during post-refinement over five macrocycles. A subset of 100 randomly selected diffraction …

https://doi.org/10.7554/eLife.05421.010
Merging statistics for myoglobin.

(A) Percent completeness and (B) average number of observations plotted as a function of resolution for the myoglobin XFEL diffraction data set consisting of all 757 diffraction images (Table 1) and …

https://doi.org/10.7554/eLife.05421.011
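
The two quantities plotted in this figure can be sketched as follows. This is a minimal illustration under stated assumptions: `merging_stats` is a hypothetical helper, the input is a dictionary mapping each unique hkl to its observation count, and the average is taken over observed reflections only (the source does not state which denominator it uses). In practice the same calculation would be repeated per resolution shell.

```python
def merging_stats(observation_counts, n_possible):
    """Return (completeness in percent, average observations per observed hkl).

    observation_counts: dict mapping a unique (h, k, l) to its number of
        observations; entries with a count of zero are treated as unobserved.
    n_possible: number of unique reflections expected in the shell.
    """
    n_observed = sum(1 for n in observation_counts.values() if n > 0)
    total_observations = sum(observation_counts.values())
    completeness = 100.0 * n_observed / n_possible
    # Assumption: average over observed reflections only.
    average = total_observations / n_observed if n_observed else 0.0
    return completeness, average

if __name__ == "__main__":
    counts = {(1, 0, 0): 4, (0, 1, 0): 2, (0, 0, 1): 0}
    print(merging_stats(counts, n_possible=4))
```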
Impact of post-refinement and number of images on electron density and model quality for myoglobin.

(A) Difference Fourier (mFo-DFc) omit maps around the heme group (which was omitted from molecular replacement and atomic model refinement) for the averaged merged, the mean-scaled …

https://doi.org/10.7554/eLife.05421.012
Quality of synchrotron vs. post-refined XFEL diffraction data sets for myoglobin.

Difference Fourier (mFo-DFc) omit maps at 1.35 Å around the heme group (which was omitted from molecular replacement and model refinement), generated from (A) the synchrotron diffraction data and …

https://doi.org/10.7554/eLife.05421.013
Impact of post-refinement on the hydrogenase diffraction data set.

(A) Difference Fourier (mFo-DFc) omit maps of one of the four Fe-S clusters (which were omitted in molecular replacement and atomic model refinement) for the averaged merged and the post-refined …

https://doi.org/10.7554/eLife.05421.014
Impact of post-refinement on the anomalous signal in the thermolysin diffraction dataset.

Anomalous difference Fourier maps for the averaged merged (A, C) or the post-refined (B, D) thermolysin XFEL diffraction data sets consisting of all 12,692 diffraction images (A, B; Table 1) and a …

https://doi.org/10.7554/eLife.05421.015
Impact of post-refinement on the quality of electron density maps and models of thermolysin.

(A) Difference Fourier (mFo-DFc) maps revealing a Leu–Lys dipeptide near the zinc site for the averaged merged and the post-refined thermolysin XFEL diffraction data sets consisting of all 12,692 …

https://doi.org/10.7554/eLife.05421.016
Convergence of structure refinements for the post-refined thermolysin XFEL data set at 2.6 Å resolution, using increasing numbers of diffraction images.

(A) Average number of observations per unique hkl. (B) CC1/2 for merged subsets using 2000–12,000 images (100% completeness for all subsets). (C) Peak height (σ) in the omit map for the largest …

https://doi.org/10.7554/eLife.05421.017
Distribution of the Ewald sphere offset rh.

The histogram shows the distribution of rh calculated after post-refinement for myoglobin using 757 diffraction images. The number of observations after applying the reflection selection criteria …

https://doi.org/10.7554/eLife.05421.018
The Ewald-offset correction function.

(A) Ewald-offset correction Eoc (Equation 14) viewed as a function of the reciprocal-lattice radius (rs) and the offset distance (rh). (B) A slice through Eoc at rs = 0.003, comparing Eoc (Equation …

https://doi.org/10.7554/eLife.05421.019
Geometry of the incident and diffracted beam for polarization correction.

The diagram shows a reflection on a plane formed by its reciprocal-space vector and the -z-axis at angle ϕ. This reflection is affected by the polarization of the incoming primary beam in both the …

https://doi.org/10.7554/eLife.05421.020

Tables

Table 1

XFEL diffraction data sets used in this study

https://doi.org/10.7554/eLife.05421.005
| | Myoglobin | Clostridium pasteurianum hydrogenase | Thermolysin |
|---|---|---|---|
| Space group | P6 | P4₂2₁2 | P6₁22 |
| Resolution used (Å) | 20.0–1.35 | 45.0–1.60 | 50.0–2.10 |
| Unit cell dimensions (Å) | a = b = 90.8, c = 45.6 | a = b = 111.2, c = 103.8 | a = b = 92.7, c = 130.5 |
| No. of unique reflections | 46,555 | 85,273 | 19,995 |
| No. of images* indexed | 757 | 177 | 12,692 |
| No. of images with spots to resolution used | 307 | 75 | 1957 |
| Average no. of spots on an image (to resolution used) | 1628 | 3640 | 352 |
| Energy spectrum | SASE | SASE | SASE |
| Detector | Rayonix MX325HE | Rayonix MX325HE | CSPAD |
| Sample delivery method | Fixed target | Fixed target | Electrospun jet |

  * This is the number of images indexed using the cctbx.xfel program; in the case of thermolysin, it is the number of images indexed for one of the two wavelengths.

  SASE: self-amplified spontaneous emission.

  CSPAD: Cornell-SLAC pixel array detector.

Table 2

Statistics of post-refinement and atomic model refinement for myoglobin

https://doi.org/10.7554/eLife.05421.006
| | 100 images | 757 images |
|---|---|---|
| Resolution [a] (Å) | 20.0–1.35 (1.40–1.35) | 20.0–1.35 (1.40–1.35) |
| Completeness [a] (%) | 80.0 (22.2) | 97.7 (79.8) |
| Average no. observations per unique hkl [a] | 4.0 (1.2) | 25.7 (2.0) |

| | 100 images, averaged-merged | 100 images, mean-scaled partiality corrected | 100 images, post-refined | 757 images, averaged-merged | 757 images, mean-scaled partiality corrected | 757 images, post-refined |
|---|---|---|---|---|---|---|
| Post-refinement parameters [b] | | | | | | |
| Linear scale factor G0 | 1.00 (0.00) | 2.79 (5.02) | 1.00 (1.04) | 1.00 (0.00) | 2.19 (3.83) | 0.89 (1.07) |
| B | 0.0 (0.0) | 0.0 (0.0) | 3.2 (7.8) | 0.0 (0.0) | 0.0 (0.0) | 6.2 (8.3) |
| γ0 (Å⁻¹) | NA | 0.00135 (0.00028) | 0.00128 (0.00022) | NA | 0.00147 (0.00042) | 0.00132 (0.00034) |
| γy (Å⁻¹) | NA | 0.00 (0.00) | 0.00007 (0.00080) | NA | 0.00 (0.00) | 0.00007 (0.00009) |
| γx (Å⁻¹) | NA | 0.00 (0.00) | 0.00010 (0.00011) | NA | 0.00 (0.00) | 0.00008 (0.00010) |
| γe (Å⁻¹) | NA | 0.00200 (0.00) | 0.00344 (0.00266) | NA | 0.00200 (0.00) | 0.00423 (0.00323) |
| Unit cell a (Å) | 90.4 (0.4) | 90.4 (0.4) | 90.5 (0.4) | 90.4 (0.4) | 90.4 (0.4) | 90.5 (0.3) |
| Unit cell c (Å) | 45.3 (0.4) | 45.3 (0.4) | 45.3 (0.3) | 45.3 (0.3) | 45.3 (0.3) | 45.3 (0.3) |
| Average Tpr, start/end | NA | NA | 19.39 (7.68)/7.17 (3.38) | NA | NA | 19.83 (7.54)/6.02 (2.59) |
| Average Txy (mm²), start/end | NA | NA | 169.74 (132.56)/132.02 (104.08) | NA | NA | 170.66 (144.52)/133.42 (109.58) |
| CC1/2 (%) | 81.3 | 79.6 | 86.5 | 91.8 | 95.7 | 98.2 |
| Molecular replacement scores [c] | | | | | | |
| LLG | 2837 | 5043 | 5291 | 8264 | 8364 | 9320 |
| TFZ | 10.5 | 13.0 | 13.4 | 13.7 | 13.8 | 14.0 |
| Structure-refinement parameters | | | | | | |
| R (%) | 39.4 | 28.0 | 23.5 | 21.1 | 20.3 | 17.8 |
| Rfree (%) | 42.1 | 29.4 | 24.8 | 23.1 | 22.5 | 19.7 |
| Bond r.m.s.d. (Å) | 0.006 | 0.006 | 0.004 | 0.006 | 0.006 | 0.006 |
| Angle r.m.s.d. (°) | 1.14 | 0.98 | 0.79 | 1.03 | 1.35 | 0.86 |
| Ramachandran favored (%) | 98.0 | 98.0 | 98.0 | 98.0 | 98.0 | 98.0 |
| Ramachandran outliers (%) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |

  a. Values in parentheses correspond to the highest-resolution shell.

  b. Post-refined parameters are shown as the mean value, with the standard deviation in parentheses.

  c. Molecular replacement scores reported by Phaser (McCoy et al., 2007): log-likelihood gain (LLG) and translation function Z-score (TFZ).
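
For readers unfamiliar with the CC1/2 values reported in these tables, the following is a minimal sketch of how such a statistic is commonly computed: the observations are divided into two random halves, each half is merged separately, and the Pearson correlation between the two sets of merged intensities is taken. The function `cc_half` and its per-reflection splitting scheme are assumptions made for illustration; they are not the prime implementation, which may split by image rather than by observation.

```python
import numpy as np

def cc_half(observations, seed=0):
    """Pearson correlation between two randomly split half data sets.

    observations: dict mapping a unique (h, k, l) to a list of measured
        intensities. Reflections with fewer than two observations are
        skipped because they cannot contribute to both halves.
    Returns the correlation as a fraction (multiply by 100 for percent).
    """
    rng = np.random.default_rng(seed)
    half1, half2 = [], []
    for intensities in observations.values():
        values = np.asarray(intensities, dtype=float)
        if values.size < 2:
            continue
        order = rng.permutation(values.size)
        mid = values.size // 2
        half1.append(values[order[:mid]].mean())   # merged intensity, half 1
        half2.append(values[order[mid:]].mean())   # merged intensity, half 2
    if len(half1) < 2:
        raise ValueError("need at least two reflections with >= 2 observations")
    return float(np.corrcoef(half1, half2)[0, 1])

if __name__ == "__main__":
    data = {(1, 0, 0): [10.0, 11.0, 9.5], (0, 1, 0): [20.0, 19.0], (0, 0, 1): [5.0, 6.0]}
    print(cc_half(data))
```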

Table 3

Statistics of post-refinement and atomic model refinement for hydrogenase

https://doi.org/10.7554/eLife.05421.007
| | 100 images | 177 images |
|---|---|---|
| Resolution [a] (Å) | 45.0–1.60 (1.66–1.60) | 45.0–1.60 (1.66–1.60) |
| Completeness [a] (%) | 83.0 (47.7) | 91.2 (63.5) |
| Average no. observations per unique hkl [a] | 4.4 (1.7) | 7.13 (2.3) |

| | 100 images, averaged-merged | 100 images, post-refined | 177 images, averaged-merged | 177 images, post-refined |
|---|---|---|---|---|
| Post-refinement parameters [b] | | | | |
| Linear scale factor G0 | 1.00 (0.00) | 0.56 (1.27) | 1.00 (0.00) | 0.53 (1.22) |
| B | 0.0 (0.0) | 10.0 (7.0) | 0.0 (0.0) | 10.5 (6.9) |
| γ0 (Å⁻¹) | NA | 0.00132 (0.00042) | NA | 0.00126 (0.00041) |
| γy (Å⁻¹) | NA | 0.00002 (0.00004) | NA | 0.00002 (0.00004) |
| γx (Å⁻¹) | NA | 0.00008 (0.00009) | NA | 0.00008 (0.00011) |
| γe (Å⁻¹) | NA | 0.00269 (0.00138) | NA | 0.00288 (0.00160) |
| Unit cell a (Å) | 110.1 (0.4) | 110.4 (0.3) | 110.1 (0.4) | 110.3 (0.4) |
| Unit cell c (Å) | 103.1 (0.4) | 103.1 (0.2) | 103.0 (0.4) | 103.0 (0.2) |
| Average Tpr, start/end | NA | 28.20 (10.86)/5.92 (2.35) | NA | 26.47 (12.70)/5.22 (2.72) |
| Average Txy (mm²), start/end | NA | 623.36 (314.57)/381.23 (198.44) | NA | 564.30 (267.45)/372.28 (202.28) |
| CC1/2 (%) | 62.0 | 77.3 | 71.7 | 84.8 |
| Molecular replacement scores [c] | | | | |
| LLG | 53,352 | 9612 | 7229 | 11774 |
| TFZ | 69.2 | 75.9 | 75.0 | 79.0 |
| Structure-refinement parameters | | | | |
| R (%) | 33.4 | 25.3 | 29.1 | 22.0 |
| Rfree (%) | 36.7 | 28.9 | 31.3 | 25.0 |
| Bond r.m.s.d. (Å) | 0.006 | 0.007 | 0.007 | 0.007 |
| Angle r.m.s.d. (°) | 1.43 | 1.50 | 1.68 | 1.97 |
| Ramachandran favored (%) | 96.3 | 97.0 | 97.0 | 96.7 |
| Ramachandran outliers (%) | 0.0 | 0.0 | 0.0 | 0.0 |

  a. Values in parentheses correspond to the highest-resolution shell.

  b. Post-refined parameters are shown as the mean value, with the standard deviation in parentheses.

  c. Molecular replacement scores reported by Phaser (McCoy et al., 2007): log-likelihood gain (LLG) and translation function Z-score (TFZ).

Table 4

Statistics of post-refinement and atomic model refinement for thermolysin

https://doi.org/10.7554/eLife.05421.008
| | 2000 images | 12,692 images |
|---|---|---|
| Resolution [a] (Å) | 50.0–2.10 (2.18–2.10) | 50.0–2.10 (2.18–2.10) |
| Completeness [a] (%) | 81.3 (24.3) | 96.5 (74.8) |
| Average no. observations per unique hkl [a] | 32.8 (1.2) | 176.6 (2.4) |

| | 2000 images, averaged-merged | 2000 images, post-refined | 12,692 images, averaged-merged | 12,692 images, post-refined |
|---|---|---|---|---|
| Post-refinement parameters [b] | | | | |
| Linear scale factor G0 | 1.00 (0.00) | 1.65 (1.66) | 1.00 (0.00) | 2.26 (75.12) |
| B | 0.0 (0.0) | 23.0 (33.8) | 0.0 (0.0) | 30.1 (59.8) |
| γ0 (Å⁻¹) | NA | 0.00052 (0.00040) | NA | 0.00051 (0.00039) |
| γy (Å⁻¹) | NA | 0.00001 (0.00003) | NA | 0.00001 (0.00003) |
| γx (Å⁻¹) | NA | 0.00002 (0.00004) | NA | 0.00002 (0.00004) |
| γe (Å⁻¹) | NA | 0.00110 (0.00129) | NA | 0.00103 (0.00128) |
| Unit cell a (Å) | 92.9 (0.3) | 92.9 (0.2) | 92.9 (0.3) | 92.9 (0.3) |
| Unit cell c (Å) | 130.5 (0.5) | 130.4 (0.4) | 130.5 (0.5) | 130.4 (0.4) |
| Average Tpr, start/end | NA | 1.15 (0.49)/0.55 (0.23) | NA | 1.15 (0.52)/0.28 (0.13) |
| Average Txy (mm²), start/end | NA | 168.13 (117.29)/167.72 (106.14) | NA | 169.01 (122.20)/170.00 (122.57) |
| CC1/2 (%) | 77.7 | 93.5 | 94.3 | 98.8 |
| Molecular replacement scores [c] | | | | |
| LLG | 3590 | 4491 | 5477 | 6022 |
| TFZ | 8.9 | 9.7 | 24.1 | 24.6 |
| Structure-refinement parameters | | | | |
| R (%) | 25.2 | 19.5 | 20.7 | 18.4 |
| Rfree (%) | 29.1 | 24.0 | 23.9 | 21.1 |
| Bond r.m.s.d. (Å) | 0.004 | 0.002 | 0.002 | 0.002 |
| Angle r.m.s.d. (°) | 0.75 | 0.58 | 0.59 | 0.62 |
| Ramachandran favored (%) | 95.9 | 94.6 | 95.2 | 94.9 |
| Ramachandran outliers (%) | 0.0 | 0.0 | 0.0 | 0.0 |
| Zinc peak height, Zn(1) (σ) | 14.0 | 16.0 | 14.3 | 20.9 |
| Zinc peak height, Zn(2) (σ) | 3.6 | 5.1 | 7.7 | 7.1 |
| Average peak height for calcium ions (σ) | 9.7 | 11.3 | 14.2 | 16.1 |

  a. Values in parentheses correspond to the highest-resolution shell.

  b. Post-refined parameters are shown as the mean value, with the standard deviation in parentheses.

  c. Molecular replacement scores reported by Phaser (McCoy et al., 2007): log-likelihood gain (LLG) and translation function Z-score (TFZ).
