Intrinsic sequence specificity of the Cas1 integrase directs new spacer acquisition

  1. Clare Rollie
  2. Stefanie Schneider
  3. Anna Sophie Brinkmann
  4. Edward L Bolt
  5. Malcolm F White (corresponding author)
  1. University of St Andrews, United Kingdom
  2. University of Duisburg-Essen, Germany
  3. University of Nottingham, United Kingdom
10 figures and 2 tables

Figures

CRISPR spacer acquisition and Cas1.

(A, 1) The 3′-end of an incoming protospacer attacks the chromosomal CRISPR locus at the boundary between the leader sequence and repeat 1. A trans-esterification (TES) reaction (yellow arrow 1) catalyzed by Cas1 joins the protospacer to the 5′ end of repeat 1. For many integrases a (reverse) disintegration reaction can be observed in vitro. (2) Another TES reaction (yellow arrow 2) joins the other strand of the protospacer to the 5′ end of repeat 1 on the bottom (minus) strand, resulting in the formation of a gapped DNA duplex. (3) The gapped duplex is repaired by the host cell DNA replication machinery, resulting in the addition of a new spacer at position 1 and replication of CRISPR repeat 1. (B) Sequences flanking the two TES reaction sites at repeat 1 in Sulfolobus solfataricus and Escherichia coli are shown. The leader is in blue, repeat in black and spacer 1 in teal. The number of central nucleotides of the repeat omitted from the sequence is shown in parentheses. (C) Structure of Cas1 from Pyrococcus horikoshii (PDB 4WJ0) with subunits coloured blue and cyan, showing the dimeric ‘butterfly’ conformation with the active site residues highlighted in green.
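
The net outcome of steps 1 to 3 in panel A can be pictured as a simple list operation. The sketch below is only a bookkeeping model of the array layout, not of the chemistry; the function name and the list representation are illustrative assumptions and do not come from the paper.

```python
def integrate_spacer(leader, repeat, spacers, new_spacer):
    """Return the array layout after one acquisition event.

    The new spacer enters at position 1 (next to the leader), and repair of
    the gapped duplex leaves one copy of the repeat on each side of it.
    """
    units = [leader]
    for spacer in [new_spacer] + list(spacers):
        units.extend([repeat, spacer])
    return units


# Hypothetical array with two existing spacers.
after = integrate_spacer("leader", "repeat", ["spacer1", "spacer2"], "new")
print(after)
```

In this model the array gains one spacer and one repeat per acquisition event, and the newest spacer always sits closest to the leader.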

https://doi.org/10.7554/eLife.08716.003
Disintegration of a branched DNA substrate by SsoCas1.

Denaturing gel electrophoresis was used to analyse the products generated by SsoCas1 with a branched DNA substrate (Substrate 1). The 5′ flap (18 nt) was released when the phosphodiester backbone was attacked by the 3′-hydroxyl group at the branch point. The reaction required active Cas1 and was independent of Cas2. DNA lengths are shown in blue (nt). The TES site is indicated with a yellow arrow and the labelled strand with a red star. (A) Shows reactions with the continuous strand (black) labelled; (B) with the flap (grey) strand labelled and (C) with the upstream (green) strand labelled, each on the 5′ end. Lanes: 1, control with no added protein; 2, WT Cas1; 3, Cas2; 4, Cas1 + Cas2; 5, Cas1 E142A variant; 6, E142A Cas1 + Cas2. (D) The 5′-flap strand was labelled on the 3′ end with a fluorescein moiety, and the flap reduced to 14 nt (Substrate 1-FAM). Cas1 catalyses the TES reaction generating a 53 nt labelled strand. Lane C: control incubation in absence of Cas1. (E) TES reactions were carried out with SsoCas1, or the E142A active site mutant, on a fork substrate containing a nicked SacI restriction site spanning the branch point (SacI substrate). A TES product of 53 nucleotides is visible in lane 2 containing Cas1. The right-hand panel shows the effect of adding SacI restriction enzyme after the TES reaction. The TES product is no longer visible, but a shorter product of 25 nucleotides is present indicating regeneration of the SacI recognition sequence by the TES reaction.
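
The product sizes quoted for panel E can be checked from the oligonucleotide sequences in Table 1. The sketch below assumes that strand 1c pairs with the 3′ end of SacI-a, that SacI-b pairs with the remainder of SacI-a, and that the label sits on the 5′ end of 1c; these are inferences from the table and the caption rather than statements from the paper. SacI cuts GAGCT^C.

```python
# Oligonucleotides of the SacI substrate, copied from Table 1 (5'->3').
SACI_A = "TAGTAAGAGATTAATAAACCCTCAGATGAGCTCTTATAGAATTGAAAGTTCGG"
SACI_B = "TTTTTTTTTTTTTTCTCATCTGAGGGTTTATTAATCTCTTACTA"
STRAND_1C = "CCGAACTTTCAATTCTATAAGAG"
SACI_SITE = "GAGCTC"  # SacI cuts after the T: GAGCT^C

# Assumption: 1c pairs with the 3' end of SacI-a and SacI-b pairs with the
# remaining 5' part of SacI-a, so the rest of SacI-b is the 5' flap.
paired_b = len(SACI_A) - len(STRAND_1C)
annealed_b = SACI_B[-paired_b:]

# The TES reaction joins the 3' end of 1c to the first paired nucleotide of
# SacI-b, releasing the flap and sealing the nick.
tes_product = STRAND_1C + annealed_b

# The SacI site does not exist in either strand alone; it is created by joining.
site_in_single_strands = SACI_SITE in STRAND_1C or SACI_SITE in SACI_B
site_start = tes_product.find(SACI_SITE)

# Assumption: the label is on the 5' end of 1c, so the labelled SacI
# fragment runs from the 5' end of the product to the cut position.
labelled_fragment = tes_product[: site_start + 5]

print(len(tes_product), len(labelled_fragment), site_in_single_strands)
```

Under these assumptions the joined strand is 53 nt long and the labelled SacI fragment is 25 nt long, which agrees with the sizes given in the caption.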

https://doi.org/10.7554/eLife.08716.004
TES activity of E. coli and S. solfataricus Cas1.

(A) E. coli Cas1 also catalyses an efficient metal-dependent disintegration reaction. TES reactions were carried out under standard conditions, using Substrate 3 and varying the divalent metal ion as indicated. EcoCas1 showed robust TES activity in the presence of cobalt, magnesium and manganese. Each of the three strands of the substrate was labelled individually as for Figure 2 (5′ label indicated by a star). Lanes: c, substrate alone; all other lanes, substrate incubated with Cas1 for 30 min at 37°C in the presence of 5 mM of the indicated additive (E, EDTA; Co, cobalt chloride; Ca, calcium chloride; Mg, magnesium chloride; Mn, manganese chloride). (B) Concentration dependence of Cas1 TES activity. Substrate 3 (50 nM) was incubated with the indicated concentration of Sso or EcoCas1 for 30 min under standard assay conditions and the reactions were analysed by denaturing gel electrophoresis and phosphorimaging. SsoCas1 showed maximal activity at 250 nM, representing a fivefold molar excess of enzyme over substrate, with a decline in activity above 500 nM enzyme. EcoCas1 activity reached a plateau above 250 nM enzyme. (C) Quantification of the data (raw data provided in Figure 3—source data 1). These data are representative of duplicate experiments.

https://doi.org/10.7554/eLife.08716.005
Importance of flap and 3′ terminus structure.

The importance of the released 25 nt 5′-flap structure was investigated by varying the length of duplex DNA in that arm from 0 (canonical 5′-flap) to a full 25 bp (nicked 3-way junction) (left-hand panel, all based on Substrate 19). All of these variants supported robust disintegration activity by SsoCas1, whereas an intact Y-junction did not support TES activity. Lanes: C, substrate alone (1, 5, 9, 13, 17); E, SsoCas1 E142A variant, 30 min incubation (2, 6, 10, 14, 18); incubation with wild-type SsoCas1 for 10 and 30 min (other lanes). The right-hand panel shows the effect of replacing the attacking 3′-hydroxyl moiety at the branch point with a phosphate group (3′-phos substrate): no TES or nuclease activity was observed for either Sso or EcoCas1. C, substrate alone; E, SsoCas1 E142A variant.

https://doi.org/10.7554/eLife.08716.007
Sequence specificity of the disintegration reaction at the +1 position.

The nucleotide at the acceptor (+1) position was varied systematically to assess the sequence dependence of the disintegration reaction carried out by Cas1 from S. solfataricus (A, B) and E. coli (C, D) (Substrates 3, 6, 7, 8). In the gels on the left (A, C) each substrate was incubated with Cas1 for 1, 2, 3, 5, 10, 15, 20 and 30 min in reaction buffer prior to electrophoresis to separate the cleaved 5′-flap from the intact substrate. C–control with no Cas1 added. The plots on the right (B, D) show quantification of these assays. Data points represent the means of triplicate experiments with standard errors shown (raw data provided in Figure 5—source data 1 and Figure 5—source data 2). The data were fitted to an exponential equation, as described in the ‘Materials and methods’, and for EcoCas1 a variable floating end point was included to allow fitting as the reaction did not go to completion. The effect of Cas2 (150 nM) on EcoCas1 (150 nM) sequence specificity for substrates (50 nM) varying at position +1 (substrates 3, 6, 7, 8) was also tested (E). The second panel from the right is a composite image from two phosphorimages of the same time course as indicated by a black line.
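
The fitting step mentioned in the caption can be illustrated with a short sketch. The exact equation is given in the Materials and methods, which are not reproduced here, so the model below is an assumption: a single exponential, y = A(1 − exp(−kt)), where A is fixed at 1 for a reaction that goes to completion or left free as a floating end point. The function name, the grid search and the synthetic data are illustrative only; the time points are those listed in the caption.

```python
import math


def fit_single_exponential(times, fractions, float_endpoint=True):
    """Least-squares fit of y = A * (1 - exp(-k * t)).

    k is scanned over a log-spaced grid (0.001 to 10 per min). For each k the
    best amplitude A has a closed form, so no external optimiser is needed.
    """
    best = None
    for i in range(4001):
        k = 10 ** (-3 + 4 * i / 4000)
        g = [1 - math.exp(-k * t) for t in times]
        if float_endpoint:
            amplitude = sum(y * gi for y, gi in zip(fractions, g)) / sum(gi * gi for gi in g)
        else:
            amplitude = 1.0  # reaction assumed to go to completion
        sse = sum((y - amplitude * gi) ** 2 for y, gi in zip(fractions, g))
        if best is None or sse < best[0]:
            best = (sse, k, amplitude)
    return best[1], best[2]


# Synthetic, noise-free data (NOT from the paper) at the caption's time points.
times = [1, 2, 3, 5, 10, 15, 20, 30]
true_k, true_a = 0.2, 0.7
synthetic = [true_a * (1 - math.exp(-true_k * t)) for t in times]

k_fit, a_fit = fit_single_exponential(times, synthetic)
print(round(k_fit, 3), round(a_fit, 3))
```

Leaving the end point free matters when a reaction stalls below 100% conversion: forcing A = 1 on such data would push the fitted rate constant down to compensate for the missing amplitude.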

https://doi.org/10.7554/eLife.08716.008
Sequence specificity of the disintegration reaction at the −1 position.

The nucleotides participating in the disintegration reaction were varied systematically at the −1 position (substrates 3, 9, 10, 11). For SsoCas1 (A) there was some preference for adenine at this position, consistent with integration site 1. For EcoCas1 (B, C), a cytosine at position −1 was disfavoured over all other possibilities, even when the residue equivalent to the ‘incoming’ nucleotide was also a cytosine (substrates 15, 16, 17, 18). Each substrate was incubated with Cas1 for 5, 10 and 30 min in reaction buffer prior to electrophoresis. C–control with no Cas1 added.

https://doi.org/10.7554/eLife.08716.011
Sequence specificity of the disintegration reaction at the −2 position.

The nucleotides participating in the disintegration reaction were varied systematically at the −2 position, which is a cytosine (Sso) or guanine (Eco) at integration site 1, and variable at integration site 2 (substrates 10, 12, 13, 14). (A) SsoCas1; (B) EcoCas1. Each substrate was incubated with Cas1 for 5, 10 and 30 min in reaction buffer prior to electrophoresis. C–control with no Cas1 added. (C) For EcoCas1, the clear preference for guanine at position −2 was confirmed by more detailed kinetic analysis (raw data provided in Figure 7—source data 1) as described for Figure 5.

https://doi.org/10.7554/eLife.08716.012
Figure 7—source data 1

Nucleotide at −2 position.

https://doi.org/10.7554/eLife.08716.013
Sequence specificity of the EcoCas1 disintegration reaction for the incoming nucleotide.

The nucleotide corresponding to the incoming 3′ end of the new spacer, which is the nucleotide at the 3′ end of the 5′-flap in the disintegration substrate, was varied systematically to determine its effect on the disintegration reaction catalysed by EcoCas1 (substrates 2, 3, 4, 5). C–control with no Cas1 added. Time points were 5, 10 and 30 min.

https://doi.org/10.7554/eLife.08716.014
Disintegration of authentic E. coli integration intermediates.

Disintegration substrates corresponding to the expected site 1 and site 2 integration products arising from the integration of spacer 3 into the CRISPR array were constructed and tested (spacer 3-1 and spacer 3-2 substrates). EcoCas1 processed both, with the rate of reaction significantly higher for the substrate corresponding to site 1 (the top strand) at the leader-repeat junction. Data points represent the means of triplicate experiments with standard errors shown (raw data provided in Figure 9—source data 1).

https://doi.org/10.7554/eLife.08716.015
Figure 9—source data 1

E. coli Site 1 vs Site 2 time course.

https://doi.org/10.7554/eLife.08716.016
Reaction scheme for spacer integration and disintegration by E. coli Cas1.

The Cas1-2 complex integrates new spacers via two joining reactions (1 and 2) at either end of the first CRISPR repeat, which differ in their sequence context. Disintegration activity by E. coli Cas1 shows clear preference for the sequence at site 1, with the guanines at position +1 and −2 particularly important. At site 2, the sequence context is not optimal for disintegration in vitro, leading to slower reaction rates. In the active site of Cas1, these nucleobases likely make specific interactions with catalytic residues, and the DNA duplex structure may be distorted.

https://doi.org/10.7554/eLife.08716.017

Tables

Table 1

Sequence of oligonucleotides used for substrate construction

https://doi.org/10.7554/eLife.08716.018
| Strand | Sequence 5′→3′ | Length |
|---|---|---|
| 1a | TAGTAAGAGATTAATAAACCCTCAGATAATCTCTTATAGAATTGAAAGTTCGG | 53 |
| 1b | TTTTTTTTTTTTTTTTTTATTATCTGAGGGTTTATTAATCTCTTACTA | 48 |
| 1c | CCGAACTTTCAATTCTATAAGAG | 23 |
| 2a | TAGTAAGAGATTAATAAACCCTCAGATAACCTCTTATAGAATTGAAAGTTCGG | 53 |
| 2b | TTTTTTTTTTTTTTTTTTGTTATCTGAGGGTTTATTAATCTCTTACTA | 48 |
| 3b | TTTTTTTTTTTTTTTTTAGTTATCTGAGGGTTTATTAATCTCTTACTA | 48 |
| 4b | TTTTTTTTTTTTTTTTTCGTTATCTGAGGGTTTATTAATCTCTTACTA | 48 |
| 5b | TTTTTTTTTTTTTTTTTGGTTATCTGAGGGTTTATTAATCTCTTACTA | 48 |
| 6a | TAGTAAGAGATTAATAAACCCTCAGATAAGCTCTTATAGAATTGAAAGTTCGG | 53 |
| 6b | TTTTTTTTTTTTTTTTTACTTATCTGAGGGTTTATTAATCTCTTACTA | 48 |
| 7a | TAGTAAGAGATTAATAAACCCTCAGATAAACTCTTATAGAATTGAAAGTTCGG | 53 |
| 7b | TTTTTTTTTTTTTTTTTATTTATCTGAGGGTTTATTAATCTCTTACTA | 48 |
| 8b | TTTTTTTTTTTTTTTTTAATTATCTGAGGGTTTATTAATCTCTTACTA | 48 |
| 9a | TAGTAAGAGATTAATAAACCCTCAGATAACATCTTATAGAATTGAAAGTTCGG | 53 |
| 9c | CCGAACTTTCAATTCTATAAGAT | 23 |
| 10a | TAGTAAGAGATTAATAAACCCTCAGATAACTTCTTATAGAATTGAAAGTTCGG | 53 |
| 10c | CCGAACTTTCAATTCTATAAGAA | 23 |
| 11a | TAGTAAGAGATTAATAAACCCTCAGATAACGTCTTATAGAATTGAAAGTTCGG | 53 |
| 11c | CCGAACTTTCAATTCTATAAGAC | 23 |
| 12a | TAGTAAGAGATTAATAAACCCTCAGATAACTCCTTATAGAATTGAAAGTTCGG | 53 |
| 12c | CCGAACTTTCAATTCTATAAGGA | 23 |
| 13a | TAGTAAGAGATTAATAAACCCTCAGATAACTGCTTATAGAATTGAAAGTTCGG | 53 |
| 13c | CCGAACTTTCAATTCTATAAGCA | 23 |
| 14a | TAGTAAGAGATTAATAAACCCTCAGATAACTACTTATAGAATTGAAAGTTCGG | 53 |
| 14c | CCGAACTTTCAATTCTATAAGTA | 23 |
| SacI-a | TAGTAAGAGATTAATAAACCCTCAGATGAGCTCTTATAGAATTGAAAGTTCGG | 53 |
| SacI-b | TTTTTTTTTTTTTTCTCATCTGAGGGTTTATTAATCTCTTACTA | 44 |
| 1b-3′-FAM | TTTTTTTTTTTTTTATTATCTGAGGGTTTATTAATCTCTTACTA-FAM | 48 |
| 19a | CCTCGAGGGATCCGTCCTAGCAAGCCGCTGCTACCGGAAGCTTCTGGACC | 50 |
| 19b | GCTCGAGTCTAGACTGCAGTTGAGAGCTTGCTAGGACGGATCCCTCGAGG | 50 |
| 19c | GGTCCAGAAGCTTCCGGTAGCAGCG | 25 |
| 20d-10 | AGTCTAGACTCGAGC | 15 |
| 20d-5 | ACTGCAGTCTAGACTCGAGC | 20 |
| 20d | TCTCAACTGCAGTCTAGACTCGAGC | 25 |
| 25c-d | GGTCCAGAAGCTTCCGGTAGCAGCGTCTCAACTGCAGTCTAGACTCGAGC | 50 |
| 1c-3′P | CCGAACTTTCAATTCTATAAGAG-phos | 25 |
| Sp3-1a | CTGGCGCGGGGAACTCTCTAAAAGTATACATTTGTTCTT | 39 |
| Sp3-1b | TGTAATTGATAATGTTGAGAGTTCCCCGCGCCAG | 34 |
| Sp3-1c | AAGAACAAATGTATACTTTTAGA | 23 |
| Sp3-2a | CCAGCGGGGATAAACCGTTTGGATCGGGTCTGGAATTTC | 39 |
| Sp3-2b | TGTTCCGACAGGGAGCCCGGTTTATCCCCGCTGG | 34 |
| Sp3-2c | GAAATTCCAGACCCGATCCAAAC | 23 |

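
How the three strands of a branched substrate fit together can be checked directly from the sequences above. The sketch below assumes that strand c pairs with the 3′ end of strand a and that strand b pairs with the rest of strand a, leaving the remainder of b as the unpaired 5′ flap; this arrangement is inferred from the sequences and figure captions, and the function names are illustrative.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def check_substrate(a, b, c):
    """Check strand pairing and return the unpaired 5' flap of strand b.

    Assumption: c pairs with the 3' end of a, and b pairs with whatever part
    of a is left over, so that b and c meet at a nick on strand a.
    """
    paired_b = len(a) - len(c)
    c_ok = reverse_complement(c) == a[-len(c):]
    b_ok = reverse_complement(b[-paired_b:]) == a[:paired_b]
    return c_ok, b_ok, b[:-paired_b]


# Sequences copied from Table 1 (5'->3').
substrates = {
    "Substrate 1": (
        "TAGTAAGAGATTAATAAACCCTCAGATAATCTCTTATAGAATTGAAAGTTCGG",
        "TTTTTTTTTTTTTTTTTTATTATCTGAGGGTTTATTAATCTCTTACTA",
        "CCGAACTTTCAATTCTATAAGAG",
    ),
    "Substrate 19": (
        "CCTCGAGGGATCCGTCCTAGCAAGCCGCTGCTACCGGAAGCTTCTGGACC",
        "GCTCGAGTCTAGACTGCAGTTGAGAGCTTGCTAGGACGGATCCCTCGAGG",
        "GGTCCAGAAGCTTCCGGTAGCAGCG",
    ),
}

results = {name: check_substrate(*oligos) for name, oligos in substrates.items()}
for name, (c_ok, b_ok, flap) in results.items():
    print(name, c_ok, b_ok, len(flap))
```

For both substrates the two pairing checks should pass, with a poly-T flap for Substrate 1 and a mixed-sequence flap for Substrate 19.
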
Table 2

DNA constructs used in this study

https://doi.org/10.7554/eLife.08716.019
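| Substrate | Oligonucleotide components | Junction −2 | Junction −1 | Junction 1 | Junction IC |
|---|---|---|---|---|---|
| Substrate 1 | 1a, 1b, 1c | A | G | A | T |
| Substrate 1-FAM | 1a, 1b-3′-FAM, 1c | A | G | A | T |
| SacI substrate | SacI-a, SacI-b, 1c | A | G | C | T |
| Substrate 2 | 2a, 2b, 1c | A | G | G | T |
| Substrate 3 | 2a, 3b, 1c | A | G | G | A |
| 3′-phos substrate | 2a, 3b, 1c-3′P | A | G | G | A |
| Substrate 4 | 2a, 4b, 1c | A | G | G | C |
| Substrate 5 | 2a, 5b, 1c | A | G | G | G |
| Substrate 6 | 6a, 6b, 1c | A | G | C | A |
| Substrate 7 | 7a, 7b, 1c | A | G | T | A |
| Substrate 8 | 1a, 8b, 1c | A | G | A | A |
| Substrate 9 | 9a, 3b, 9c | A | T | G | A |
| Substrate 10 | 10a, 3b, 10c | A | A | G | A |
| Substrate 11 | 11a, 3b, 11c | A | C | G | A |
| Substrate 12 | 12a, 3b, 12c | G | A | G | A |
| Substrate 13 | 13a, 3b, 13c | C | A | G | A |
| Substrate 14 | 14a, 3b, 14c | T | A | G | A |
| Substrate 15 | 2a, 4b, 1c | A | G | G | C |
| Substrate 16 | 11a, 4b, 11c | A | C | G | C |
| Substrate 17 | 10a, 4b, 10c | A | A | G | C |
| Substrate 18 | 9a, 4b, 9c | A | T | G | C |
| Substrate 19 | 19a, 19b, 19c | C | G | G | A |
| Gap10 | 19a, 19b, 19c, 20d-10 | C | G | G | A |
| Gap5 | 19a, 19b, 19c, 20d-5 | C | G | G | A |
| Nick | 19a, 19b, 19c, 20d | C | G | G | A |
| Y-junction | 19a, 19b, 20c-d | C | G | G | A |
| Spacer 3-1 substrate | Sp3-1a, Sp3-1b, Sp3-1c | G | A | G | A |
| Spacer 3-2 substrate | Sp3-2a, Sp3-2b, Sp3-2c | A | C | G | C |
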
  1. The sequence of the central portion of the junction (positions −2, −1, 1 and the incoming nucleotide (IC)) for each substrate is shown. The oligonucleotides used to assemble the complete substrate are indicated.
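
The junction sequences in Table 2 can be derived from the component oligonucleotides in Table 1. The sketch below is one way to do this, assuming that positions −2 and −1 are the last two nucleotides of strand c, that position 1 is the first nucleotide of strand b paired with strand a, and that IC is the flap nucleotide immediately 5′ of it. The function name and the rule for finding the branch point (length of a minus length of c) are inferences from the tables, not taken from the paper.

```python
def junction_sequence(a, b, c):
    """Return the junction as a four-letter string: -2, -1, 1, IC.

    Assumption: c pairs with the 3' end of a and b pairs with the rest of a,
    so the branch point on b lies len(a) - len(c) nucleotides from its 3' end.
    """
    paired_b = len(a) - len(c)
    minus_2, minus_1 = c[-2], c[-1]  # 3' end of the attacking strand
    plus_1 = b[-paired_b]            # first paired nucleotide of b (acceptor)
    incoming = b[-paired_b - 1]      # last nucleotide of the 5' flap
    return minus_2 + minus_1 + plus_1 + incoming


# Sequences copied from Table 1 (5'->3').
OLIGOS = {
    "2a": "TAGTAAGAGATTAATAAACCCTCAGATAACCTCTTATAGAATTGAAAGTTCGG",
    "3b": "TTTTTTTTTTTTTTTTTAGTTATCTGAGGGTTTATTAATCTCTTACTA",
    "1c": "CCGAACTTTCAATTCTATAAGAG",
    "12a": "TAGTAAGAGATTAATAAACCCTCAGATAACTCCTTATAGAATTGAAAGTTCGG",
    "12c": "CCGAACTTTCAATTCTATAAGGA",
    "Sp3-2a": "CCAGCGGGGATAAACCGTTTGGATCGGGTCTGGAATTTC",
    "Sp3-2b": "TGTTCCGACAGGGAGCCCGGTTTATCCCCGCTGG",
    "Sp3-2c": "GAAATTCCAGACCCGATCCAAAC",
}

# Components for three of the substrates listed in Table 2.
COMPONENTS = {
    "Substrate 3": ("2a", "3b", "1c"),
    "Substrate 12": ("12a", "3b", "12c"),
    "Spacer 3-2 substrate": ("Sp3-2a", "Sp3-2b", "Sp3-2c"),
}

junctions = {
    name: junction_sequence(*(OLIGOS[part] for part in parts))
    for name, parts in COMPONENTS.items()
}
print(junctions)
```

For the three substrates shown, this rule gives AGGA, GAGA and ACGC, the same junction sequences as listed in the table.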

Cite this article

Clare Rollie, Stefanie Schneider, Anna Sophie Brinkmann, Edward L Bolt, Malcolm F White (2015) Intrinsic sequence specificity of the Cas1 integrase directs new spacer acquisition. eLife 4:e08716. https://doi.org/10.7554/eLife.08716