Figures and data in Intrinsic sequence specificity of the Cas1 integrase directs new spacer acquisition

10 figures and 2 tables

Figures

Figure 1

CRISPR spacer acquisition and Cas1.

(A, 1) The 3′-end of an incoming protospacer attacks the chromosomal CRISPR locus at the boundary between the leader sequence and repeat 1. A trans-esterification (TES) reaction (yellow arrow 1) catalyzed by Cas1 joins the protospacer to the 5′ end of repeat 1. For many integrases a (reverse) disintegration reaction can be observed in vitro. (2) Another TES reaction (yellow arrow 2) joins the other strand of the protospacer to the 5′ end of repeat 1 on the bottom (minus) strand, resulting in the formation of a gapped DNA duplex. (3) The gapped duplex is repaired by the host cell DNA replication machinery, resulting in the addition of a new spacer at position 1 and replication of CRISPR repeat 1. (B) Sequences flanking the two TES reaction sites at repeat 1 in *Sulfolobus solfataricus* and *Escherichia coli* are shown. The leader is in blue, repeat in black and spacer 1 in teal. The number of central nucleotides of the repeat omitted from the sequence is shown in parentheses. (C) Structure of Cas1 from *Pyrococcus horikoshii* (PDB 4WJ0) with subunits coloured blue and cyan, showing the dimeric ‘butterfly’ conformation with the active site residues highlighted in green.

https://doi.org/10.7554/eLife.08716.003

Figure 2

Disintegration of a branched DNA substrate by SsoCas1.

Denaturing gel electrophoresis was used to analyse the products generated by SsoCas1 with a branched DNA substrate (Substrate 1). The 5′-flap (18 nt) was released when the phosphodiester backbone was attacked by the 3′-hydroxyl group at the branch point. The reaction required active Cas1 and was independent of Cas2. DNA lengths are shown in blue (nt). The TES site is indicated with a yellow arrow and the labelled strand with a red star. Reactions are shown with (A) the continuous strand (black), (B) the flap strand (grey) and (C) the upstream strand (green) labelled, each on the 5′ end. Lanes: 1, control with no added protein; 2, WT Cas1; 3, Cas2; 4, Cas1 + Cas2; 5, Cas1 E142A variant; 6, E142A Cas1 + Cas2. (D) The 5′-flap strand was labelled on the 3′ end with a fluorescein moiety, and the flap was shortened to 14 nt (Substrate 1-FAM). Cas1 catalyses the TES reaction, generating a 53 nt labelled strand. Lane C: control incubation in the absence of Cas1. (E) TES reactions were carried out with SsoCas1, or the E142A active site mutant, on a fork substrate containing a nicked SacI restriction site spanning the branch point (SacI substrate). A TES product of 53 nucleotides is visible in lane 2, which contains Cas1. The right-hand panel shows the effect of adding SacI restriction enzyme after the TES reaction. The TES product is no longer visible, but a shorter product of 25 nucleotides is present, indicating regeneration of the SacI recognition sequence by the TES reaction.

https://doi.org/10.7554/eLife.08716.004
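
As a rough check on the product sizes quoted for Figure 2E, the sketch below joins strand 1c to the base-paired part of SacI-b and then cuts the joined strand at the regenerated SacI site. The sequences are taken from Table 1. The function names, the assumption that the 5′ label sits on strand 1c, and the use of the standard SacI cut position (GAGCT^C) are illustrative choices, not details stated in the legend.

```python
# Illustrative sketch only: sizes of the TES product and its SacI fragment.
# Oligonucleotides are copied from Table 1 (5'->3').
ONE_C = "CCGAACTTTCAATTCTATAAGAG"  # strand 1c, 23 nt
SACI_A = "TAGTAAGAGATTAATAAACCCTCAGATGAGCTCTTATAGAATTGAAAGTTCGG"  # SacI-a, 53 nt
SACI_B = "T" * 14 + "CTCATCTGAGGGTTTATTAATCTCTTACTA"  # SacI-b, 44 nt

SACI_SITE = "GAGCTC"
SACI_CUT_OFFSET = 5  # SacI cuts after the T: GAGCT^C


def tes_product(a, b, c):
    """Join strand c to the base-paired 3' part of strand b (flap released)."""
    duplex_len = len(a) - len(c)    # part of strand a that pairs with strand b
    flap_len = len(b) - duplex_len  # unpaired 5' flap of strand b
    return c + b[flap_len:]


def saci_five_prime_fragment(strand):
    """Return the 5' fragment left after SacI cleavage, or None if no site."""
    site = strand.find(SACI_SITE)
    if site < 0:
        return None
    return strand[: site + SACI_CUT_OFFSET]


product = tes_product(SACI_A, SACI_B, ONE_C)
print(len(product))                            # length of the joined strand
print(len(saci_five_prime_fragment(product)))  # length of the 5' SacI fragment
```

Under these assumptions the joined strand is 53 nt and the 5′ fragment after SacI cleavage is 25 nt, in line with the sizes given in the legend. The SacI site is absent from either strand on its own and only appears once the two are joined.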

Figure 3

TES activity of *E. coli* and *S. solfataricus* Cas1.

(A) *E. coli* Cas1 also catalyses an efficient metal-dependent disintegration reaction. TES reactions were carried out under standard conditions, using Substrate 3 and varying the divalent metal ion as indicated. EcoCas1 showed robust TES activity in the presence of cobalt, magnesium and manganese. Each of the three strands of the substrate was labelled individually as for Figure 2 (5′ label indicated by a star). Lanes: c, substrate alone; all other lanes, substrate incubated with Cas1 for 30 min at 37°C in the presence of 5 mM of the indicated additive (E, EDTA; Co, cobalt chloride; Ca, calcium chloride; Mg, magnesium chloride; Mn, manganese chloride). (B) Concentration dependence of Cas1 TES activity. Substrate 3 (50 nM) was incubated with the indicated concentration of SsoCas1 or EcoCas1 for 30 min under standard assay conditions and the reaction products were analysed by denaturing gel electrophoresis and phosphorimaging. SsoCas1 showed maximal activity at 250 nM, representing a fivefold molar excess of enzyme over substrate, with a decline in activity above 500 nM enzyme. The activity of EcoCas1 reached a plateau above 250 nM enzyme. (C) Quantification of the data (raw data provided in Figure 3—source data 1). These data are representative of duplicate experiments.

https://doi.org/10.7554/eLife.08716.005

Figure 3—source data 1 Concentration Cas1.: https://doi.org/10.7554/eLife.08716.006
Download elife-08716-fig3-data1-v2.xlsx

Figure 4

Importance of flap and 3′ terminus structure.

The importance of the released 25 nt 5′-flap structure was investigated by varying the length of duplex DNA in that arm from 0 (canonical 5′-flap) to a full 25 bp (nicked 3-way junction) (left-hand panel, all based on Substrate 19). All supported robust disintegration activity by SsoCas1. An intact Y-junction did not support TES activity. Lanes: C, substrate alone (1, 5, 9, 13, 17); E, SsoCas1 E142A variant, 30 min incubation (2, 6, 10, 14, 18); other lanes, incubation with wild-type SsoCas1 for 10 and 30 min. The right-hand panel shows the effect of replacing the attacking 3′-hydroxyl moiety at the branch point with a phosphate group (3′-phos substrate): no TES or nuclease activity was observed for either SsoCas1 or EcoCas1. C, substrate alone; E, SsoCas1 E142A variant.

https://doi.org/10.7554/eLife.08716.007

Figure 5

Sequence specificity of the disintegration reaction at the +1 position.

The nucleotide at the acceptor (+1) position was varied systematically to assess the sequence dependence of the disintegration reaction carried out by Cas1 from *S. solfataricus* (A, B) and *E. coli* (C, D) (Substrates 3, 6, 7, 8). In the gels on the left (A, C) each substrate was incubated with Cas1 for 1, 2, 3, 5, 10, 15, 20 and 30 min in reaction buffer prior to electrophoresis to separate the cleaved 5′-flap from the intact substrate. C–control with no Cas1 added. The plots on the right (B, D) show quantification of these assays. Data points represent the means of triplicate experiments with standard errors shown (raw data provided in Figure 5—source data 1 and Figure 5—source data 2). The data were fitted to an exponential equation, as described in the ‘Materials and methods’, and for EcoCas1 a variable floating end point was included to allow fitting as the reaction did not go to completion. The effect of Cas2 (150 nM) on EcoCas1 (150 nM) sequence specificity for substrates (50 nM) varying at position +1 (substrates 3, 6, 7, 8) was also tested (E). The second panel from the right is a composite image from two phosphorimages of the same time course as indicated by a black line.

https://doi.org/10.7554/eLife.08716.008

Figure 5—source data 1 Nucleotide at +1 position.: https://doi.org/10.7554/eLife.08716.009
Download elife-08716-fig5-data1-v2.xlsx
Figure 5—source data 2 Nucleotide at +1 position.: https://doi.org/10.7554/eLife.08716.010
Download elife-08716-fig5-data2-v2.xlsx
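
The fitting equation for Figure 5 is given in the ‘Materials and methods’ and is not reproduced on this page. Purely as an illustrative assumption, a single-exponential time course with an optional floating end point is often written in the following form, where $f(t)$ is the fraction of substrate converted at time $t$, $k$ is the apparent rate constant and $A$ is the end point:

```latex
f(t) = A\left(1 - e^{-kt}\right)
```

In a form like this, $A$ would be fixed at 1 for a reaction that goes to completion and left as a free parameter when it does not, as described for EcoCas1. The symbols here are hypothetical and may differ from those used by the authors.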

Figure 6

Sequence specificity of the disintegration reaction at the −1 position.

The nucleotides participating in the disintegration reaction were varied systematically at the −1 position (substrates 3, 9, 10, 11). For SsoCas1 (A) there was some preference for adenine at this position, consistent with integration site 1. For EcoCas1 (B, C), a cytosine at position −1 was disfavoured over all other possibilities, even when the residue equivalent to the ‘incoming’ nucleotide was also a cytosine (substrates 15, 16, 17, 18). Each substrate was incubated with Cas1 for 5, 10 and 30 min in reaction buffer prior to electrophoresis. C–control with no Cas1 added.

https://doi.org/10.7554/eLife.08716.011

Figure 7

Sequence specificity of the disintegration reaction at the −2 position.

The nucleotides participating in the disintegration reaction were varied systematically at the −2 position, which is a cytosine (Sso) or guanine (Eco) at integration site 1, and variable at integration site 2 (substrates 10, 12, 13, 14). (A) SsoCas1; (B) EcoCas1. Each substrate was incubated with Cas1 for 5, 10 and 30 min in reaction buffer prior to electrophoresis. C–control with no Cas1 added. (C) For EcoCas1, the clear preference for guanine at position −2 was confirmed by more detailed kinetic analysis (raw data provided in Figure 7—source data 1) as described for Figure 5.

https://doi.org/10.7554/eLife.08716.012

Figure 7—source data 1 Nucleotide at −2 position.: https://doi.org/10.7554/eLife.08716.013
Download elife-08716-fig7-data1-v2.xlsx

Figure 8

Sequence specificity of the EcoCas1 disintegration reaction for the incoming nucleotide.

The nucleotide corresponding to the incoming 3′ end of the new spacer, which is the nucleotide at the 3′ end of the 5′-flap in the disintegration substrate, was varied systematically to determine its effect on the disintegration reaction catalysed by EcoCas1 (substrates 2, 3, 4, 5). C–control with no Cas1 added. Time points were 5, 10 and 30 min.

https://doi.org/10.7554/eLife.08716.014

Figure 9

Disintegration of authentic *E. coli* integration intermediates.

Disintegration substrates corresponding to the expected site 1 and site 2 integration products arising from the integration of spacer 3 into the CRISPR array were constructed and tested (spacer 3-1 and spacer 3-2 substrates). EcoCas1 processed both, with the rate of reaction significantly higher for the substrate corresponding to site 1 (the top strand) at the leader-repeat junction. Data points represent the means of triplicate experiments with standard errors shown (raw data provided in Figure 9—source data 1).

https://doi.org/10.7554/eLife.08716.015

Figure 9—source data 1 E. coli Site 1 vs Site 2 time course.: https://doi.org/10.7554/eLife.08716.016
Download elife-08716-fig9-data1-v2.xlsx

Figure 10

Reaction scheme for spacer integration and disintegration by *E. coli* Cas1.

The Cas1-Cas2 complex integrates new spacers via two joining reactions (1 and 2) at either end of the first CRISPR repeat, which differ in their sequence context. Disintegration activity by *E. coli* Cas1 shows a clear preference for the sequence at site 1, with the guanines at positions +1 and −2 particularly important. At site 2, the sequence context is not optimal for disintegration in vitro, leading to slower reaction rates. In the active site of Cas1, these nucleobases likely make specific interactions with catalytic residues, and the DNA duplex structure may be distorted.

https://doi.org/10.7554/eLife.08716.017

Tables

Table 1

Sequence of oligonucleotides used for substrate construction

https://doi.org/10.7554/eLife.08716.018

Strand	Sequence 5′→3′	Length
1a	TAGTAAGAGATTAATAAACCCTCAGATAATCTCTTATAGAATTGAAAGTTCGG	53
1b	TTTTTTTTTTTTTTTTTTATTATCTGAGGGTTTATTAATCTCTTACTA	48
1c	CCGAACTTTCAATTCTATAAGAG	23
2a	TAGTAAGAGATTAATAAACCCTCAGATAACCTCTTATAGAATTGAAAGTTCGG	53
2b	TTTTTTTTTTTTTTTTTTGTTATCTGAGGGTTTATTAATCTCTTACTA	48
3b	TTTTTTTTTTTTTTTTTAGTTATCTGAGGGTTTATTAATCTCTTACTA	48
4b	TTTTTTTTTTTTTTTTTCGTTATCTGAGGGTTTATTAATCTCTTACTA	48
5b	TTTTTTTTTTTTTTTTTGGTTATCTGAGGGTTTATTAATCTCTTACTA	48
6a	TAGTAAGAGATTAATAAACCCTCAGATAAGCTCTTATAGAATTGAAAGTTCGG	53
6b	TTTTTTTTTTTTTTTTTACTTATCTGAGGGTTTATTAATCTCTTACTA	48
7a	TAGTAAGAGATTAATAAACCCTCAGATAAACTCTTATAGAATTGAAAGTTCGG	53
7b	TTTTTTTTTTTTTTTTTATTTATCTGAGGGTTTATTAATCTCTTACTA	48
8b	TTTTTTTTTTTTTTTTTAATTATCTGAGGGTTTATTAATCTCTTACTA	48
9a	TAGTAAGAGATTAATAAACCCTCAGATAACATCTTATAGAATTGAAAGTTCGG	53
9c	CCGAACTTTCAATTCTATAAGAT	23
10a	TAGTAAGAGATTAATAAACCCTCAGATAACTTCTTATAGAATTGAAAGTTCGG	53
10c	CCGAACTTTCAATTCTATAAGAA	23
11a	TAGTAAGAGATTAATAAACCCTCAGATAACGTCTTATAGAATTGAAAGTTCGG	53
11c	CCGAACTTTCAATTCTATAAGAC	23
12a	TAGTAAGAGATTAATAAACCCTCAGATAACTCCTTATAGAATTGAAAGTTCGG	53
12c	CCGAACTTTCAATTCTATAAGGA	23
13a	TAGTAAGAGATTAATAAACCCTCAGATAACTGCTTATAGAATTGAAAGTTCGG	53
13c	CCGAACTTTCAATTCTATAAGCA	23
14a	TAGTAAGAGATTAATAAACCCTCAGATAACTACTTATAGAATTGAAAGTTCGG	53
14c	CCGAACTTTCAATTCTATAAGTA	23
SacI-a	TAGTAAGAGATTAATAAACCCTCAGATGAGCTCTTATAGAATTGAAAGTTCGG	53
SacI-b	TTTTTTTTTTTTTTCTCATCTGAGGGTTTATTAATCTCTTACTA	44
1b-3′-FAM	TTTTTTTTTTTTTTATTATCTGAGGGTTTATTAATCTCTTACTA-FAM	44
19a	CCTCGAGGGATCCGTCCTAGCAAGCCGCTGCTACCGGAAGCTTCTGGACC	50
19b	GCTCGAGTCTAGACTGCAGTTGAGAGCTTGCTAGGACGGATCCCTCGAGG	50
19c	GGTCCAGAAGCTTCCGGTAGCAGCG	25
20d-10	AGTCTAGACTCGAGC	15
20d-5	ACTGCAGTCTAGACTCGAGC	20
20d	TCTCAACTGCAGTCTAGACTCGAGC	25
25c-d	GGTCCAGAAGCTTCCGGTAGCAGCGTCTCAACTGCAGTCTAGACTCGAGC	50
1c-3′P	CCGAACTTTCAATTCTATAAGAG-phos	23
Sp3-1a	CTGGCGCGGGGAACTCTCTAAAAGTATACATTTGTTCTT	39
Sp3-1b	TGTAATTGATAATGTTGAGAGTTCCCCGCGCCAG	34
Sp3-1c	AAGAACAAATGTATACTTTTAGA	23
Sp3-2a	CCAGCGGGGATAAACCGTTTGGATCGGGTCTGGAATTTC	39
Sp3-2b	TGTTCCGACAGGGAGCCCGGTTTATCCCCGCTGG	34
Sp3-2c	GAAATTCCAGACCCGATCCAAAC	23

Table 2

DNA constructs used in this study

https://doi.org/10.7554/eLife.08716.019

Substrate	Oligonucleotide components	−2	−1	1	IC
Substrate 1	1a, 1b, 1c	A	G	A	T
Substrate 1-FAM	1a, 1b-3′-FAM, 1c	A	G	A	T
SacI substrate	SacI-a, SacI-b, 1c	A	G	C	T
Substrate 2	2a, 2b, 1c	A	G	G	T
Substrate 3	2a, 3b, 1c	A	G	G	A
3′-phos substrate	2a, 3b, 1c-3′P	A	G	G	A
Substrate 4	2a, 4b, 1c	A	G	G	C
Substrate 5	2a, 5b, 1c	A	G	G	G
Substrate 6	6a, 6b, 1c	A	G	C	A
Substrate 7	7a, 7b, 1c	A	G	T	A
Substrate 8	1a, 8b, 1c	A	G	A	A
Substrate 9	9a, 3b, 9c	A	T	G	A
Substrate 10	10a, 3b, 10c	A	A	G	A
Substrate 11	11a, 3b, 11c	A	C	G	A
Substrate 12	12a, 3b, 12c	G	A	G	A
Substrate 13	13a, 3b, 13c	C	A	G	A
Substrate 14	14a, 3b, 14c	T	A	G	A
Substrate 15	2a, 4b, 1c	A	G	G	C
Substrate 16	11a, 4b, 11c	A	C	G	C
Substrate 17	10a, 4b, 10c	A	A	G	C
Substrate 18	9a, 4b, 9c	A	T	G	C
Substrate 19	19a, 19b, 19c	C	G	G	A
Gap10	19a, 19b, 19c, 20d-10	C	G	G	A
Gap5	19a, 19b, 19c, 20d-5	C	G	G	A
Nick	19a, 19b, 19c, 20d	C	G	G	A
Y-junction	19a, 19b, 25c-d	C	G	G	A
Spacer 3-1 substrate	Sp3-1a, Sp3-1b, Sp3-1c	G	A	G	A
Spacer 3-2 substrate	Sp3-2a, Sp3-2b, Sp3-2c	A	C	G	C

The sequence of the central portion of the junction (positions −2, −1, 1 and the incoming nucleotide (IC)) for each substrate is shown. The oligonucleotides used to assemble the complete substrate are indicated.
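
The junction columns of Table 2 can be recovered directly from the Table 1 sequences. The sketch below is a minimal illustration, assuming each three-strand substrate is assembled so that strand c pairs with the 3′ end of strand a and the 3′ part of strand b pairs with the rest of strand a. The function and variable names are hypothetical.

```python
# Illustrative sketch: derive the junction positions (-2, -1, +1, IC) of a
# three-strand flap substrate from its Table 1 oligonucleotides (5'->3').
# Assumption: strand c pairs with the 3' end of strand a, and the 3' portion
# of strand b pairs with the remaining 5' portion of strand a.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def junction(a, b, c):
    """Return the nucleotides at positions (-2, -1, +1, IC)."""
    duplex_len = len(a) - len(c)    # length of the a:b duplex arm
    flap_len = len(b) - duplex_len  # unpaired 5' flap of strand b
    return (c[-2], c[-1], b[flap_len], b[flap_len - 1])


SUBSTRATES = {
    "Substrate 1": (
        "TAGTAAGAGATTAATAAACCCTCAGATAATCTCTTATAGAATTGAAAGTTCGG",  # 1a
        "T" * 18 + "ATTATCTGAGGGTTTATTAATCTCTTACTA",              # 1b
        "CCGAACTTTCAATTCTATAAGAG",                                # 1c
    ),
    "Spacer 3-1 substrate": (
        "CTGGCGCGGGGAACTCTCTAAAAGTATACATTTGTTCTT",  # Sp3-1a
        "TGTAATTGATAATGTTGAGAGTTCCCCGCGCCAG",       # Sp3-1b
        "AAGAACAAATGTATACTTTTAGA",                  # Sp3-1c
    ),
    "Spacer 3-2 substrate": (
        "CCAGCGGGGATAAACCGTTTGGATCGGGTCTGGAATTTC",  # Sp3-2a
        "TGTTCCGACAGGGAGCCCGGTTTATCCCCGCTGG",       # Sp3-2b
        "GAAATTCCAGACCCGATCCAAAC",                  # Sp3-2c
    ),
}

for name, (a, b, c) in SUBSTRATES.items():
    # Sanity check on the pairing assumption before reading off the junction.
    assert a.endswith(reverse_complement(c))
    print(name, junction(a, b, c))
```

For the three substrates included here, the values returned agree with the corresponding rows of Table 2. The same function does not apply to the Y-junction, which has no free 3′ end at the branch point.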