
Multiple serine transposase dimers assemble the transposon-end synaptic complex during IS607-family transposition

  1. Wenyang Chen
  2. Sridhar Mandali
  3. Stephen P Hancock
  4. Pramod Kumar
  5. Michael Collazo
  6. Duilio Cascio
  7. Reid C Johnson (corresponding author)
  1. David Geffen School of Medicine, University of California at Los Angeles, United States
  2. University of California at Los Angeles, United States
Research Article
Cite this article as: eLife 2018;7:e39611 doi: 10.7554/eLife.39611

Abstract

IS607-family transposons are unusual because they do not have terminal inverted repeats or generate target site duplications. They encode two protein-coding genes, but only tnpA is required for transposition. Our X-ray structures confirm that TnpA is a member of the serine recombinase (SR) family, but the chemically-inactive quaternary structure of the dimer, along with the N-terminal location of the DNA binding domain, are different from other SRs. TnpA dimers from IS1535 cooperatively associate with multiple subterminal repeats, which together with additional nonspecific binding, form a nucleoprotein filament on one transposon end that efficiently captures a second unbound end to generate the paired-end complex (PEC). Formation of the PEC does not require a change in the dimeric structure of the catalytic domain, but remodeling of the C-terminal α-helical region is involved. We posit that the PEC recruits a chemically-active conformer of TnpA to the transposon end to initiate DNA chemistry.

https://doi.org/10.7554/eLife.39611.001

Introduction

Although sometimes thought of as DNA parasites, transposable elements (TEs) are widely recognized as playing prominent roles in the evolution of genomes (Biémont, 2010; Brunet and Doolittle, 2015; Volff, 2006). TE-derived sequences make up almost half of the human genome and, in some organisms such as maize, the vast majority of the genome (International Human Genome Sequencing Consortium et al., 2001; Springer et al., 2009). TEs can be usurped or ‘domesticated’ to perform critical functions, such as promoting DNA rearrangements essential for immunity in mammals or in the development of the micronucleus in ciliated protozoa (Baudry et al., 2009; Kapitonov and Jurka, 2005; Nowacki et al., 2009). In bacteria, mobile DNA elements promote horizontal spread of pathogenicity determinants and antibiotic resistance genes (Frost et al., 2005; Hooper et al., 2009). TEs are also exploited for genome engineering (Ivics and Izsvák, 2010; Woodard and Wilson, 2015).

Transposases have been reported to be the most frequently occurring functional group of proteins (Aziz et al., 2010). Among the four major classes of DNA transposases, the large and diverse DDE/D family, whose members contain an RNase H fold and typically transpose through a cut-and-paste mechanism, has been the most intensively studied (Hickman et al., 2010; Yuan and Wessler, 2011). Recently, the mechanism of transposition by HUH-family elements, which undergo a rolling circle replicative mechanism of DNA transfer, has been elucidated (He et al., 2015). The tyrosine and serine families of recombinases, which have been extensively studied in the context of site-specific recombination reactions, also promote DNA transposition. Respectively, these enzymes splice DNA through a sequential pair of single-strand exchanges or through double-strand breaks, generating a transient covalent linkage between the cleaved DNA end and a tyrosine or serine on the protein (Rubio-Cosials et al., 2018; Stark, 2014; Wood and Gardner, 2015). In this study, we investigate the mechanism by which the IS607-family of serine recombinases transpose DNA. As described below, IS607-family TEs have a number of properties that are unusual among TEs, and the serine transposase structure has features unlike other serine recombinases.

Serine recombinases (SRs) have been broadly classified into three subfamilies (Smith and Thorpe, 2002). The small SRs (smSR) typically catalyze highly regulated recombination reactions between specific DNA sites that are usually on the same DNA molecule (Johnson, 2015; Rice, 2015). The serine integrase or large SR (LSR) subfamily typically promotes phage integration and excision between specific sites (Smith, 2015; Van Duyne and Rutherford, 2013), but certain members promote DNA translocation reactions (Bannam et al., 1995; Wang et al., 2006). SmSRs and LSRs have their DNA binding domains (DBDs) at the C-terminal end of the protein, although the LSRs have a more elaborate C-terminal DNA binding and regulatory domain. The SRs found in IS607-family transposable elements, however, are distinguished by the location of their DBDs at their N-termini (experimentally confirmed below). This domain architecture is paradoxical because studies on smSRs imply that an N-terminally located DBD would be incompatible with the formation of active tetramers, which is the critical regulatory step of these reactions (Johnson, 2015; Rice, 2015; Stark, 2014).

The founding member of the IS607 family was first described by Berg and co-workers (Kersulyte et al., 2000), who also noted the relationship between the Helicobacter pylori IS607 element and annotated insertion sequence elements like IS1535 in the Mycobacterium tuberculosis genome sequence (Cole et al., 1998). IS607-family elements have been subsequently found in a wide range of bacterial species, including cyanobacteria, and in archaea (Filée et al., 2007b; Kuno et al., 2010). IS607-related sequences have also been found in eukaryotic genomes and viruses, probably primarily through horizontal DNA transfer events, and have been described as the most widely distributed transposon in nature (Filée et al., 2007a; Gilbert and Cordaux, 2013).

IS607 elements encode two orfs, which often overlap in their coding sequence (Figure 1A). OrfA exhibits homology with SRs and is sufficient to mediate transposition of IS607 in E. coli (this paper and Kersulyte et al., 2000). The OrfB sequence bears a clear relationship with RuvC and Cas9, and is also present in some IS200/IS605 family members, some eukaryotic transposons, and as standalone genes (Bao and Jurka, 2013; Kapitonov et al., 2015). Surprisingly, the DNA sequences at the termini of individual IS607 elements are not related, but an inverted repeat sequence, often imperfect, is present near but at different distances from the ends of the element (Figure 1B). A common feature of the ends of IS607 elements is the presence of short directly-repeated motifs, which are positioned at different spacings with respect to each other and to the host DNA junction (Figure 1B). An additional unusual feature is that the IS607 transposition reaction does not create target site duplications (this paper and Blount and Grogan, 2005; Kersulyte et al., 2000). The absence of target site duplications may make IS607 elements useful as vehicles for delivering and subsequently removing genes from chromosomes without generating a genetic scar, similar to applications of the TE piggyBac (Woltjen et al., 2009; Woodard and Wilson, 2015).

Figure 1 with 1 supplement
IS607-family transposons.

(A) Overall structure of IS607-family transposons with lengths given for the orfA and orfB coding regions (amino acid residues) of elements discussed in this paper. (B) DNA sequences at the ends of IS607-family transposons. The bottom strand of the left end (b–LE) and top strand of the right end (t–RE) are aligned with flanking host DNA sequences in green. Arrows highlight common sequences (inverted repeats) between the ends, and short sequence motifs (bold type are matches) for individual elements are denoted above and below the end sequences (for IS607 and ISC1926, sequence motif lengths can be extended with A or T on either side). The transposon-host borders for each of these elements have been reassigned based on alignments with related elements in their respective genomes and sequence analysis of transposition events (IS607 and ISC1926). The termini contain a GG, and the unoccupied host target sequences also contain a GG at the exchange site (e.g., panel C). (C) Transposition by IS607 in E. coli. Top: reconstructed IS607 transposons used in the transposition assays. OrfA and orfB, when present, are transcribed from the E. coli lac promoter (P) and contain ribosome binding sites. Middle: transposition frequencies onto phage λ of IS607 derivatives. Average and standard deviations are given for IS607orfA (n = 6) and Tn5 (n = 3) as a comparative control. Bottom: an example of a λ::IS607orfA transposition product. Sequences of the IS607 ends (bold), the unoccupied target, and the left and right end junctions after insertion of IS607orfA are shown. The site of DNA exchange is boxed. Additional insertion site sequences and a compilation are given in Figure 1—figure supplement 1.

https://doi.org/10.7554/eLife.39611.002

In this work we investigate the serine transposase from three IS607-family elements: IS607 from H. pylori (Kersulyte et al., 2000), IS1535 from M. tuberculosis, and ISC1926 from the hyperthermophilic archaeon Sulfolobus islandicus (Blount and Grogan, 2005). We confirm that OrfA is the only IS607-encoded protein required to catalyze transposition in E. coli, determine the domain structure of the three transposases, and describe X-ray structures of the OrfA catalytic domains from IS1535 and ISC1926, which exhibit remarkable differences in quaternary structure from other SR-family members. We show that OrfA from IS1535 efficiently generates paired-end complexes by an unexpected mechanism involving cooperative assembly of multiple proteins, which is both unlike other transposases studied to date and unlike synaptic complex formation by other SR-family members.

Results

IS607 transposition in vivo

We first sought to confirm and extend salient features of IS607 transposition originally described by Berg and co-workers (Kersulyte et al., 2000). We engineered tetracycline-resistant IS607 derivatives containing the left and right transposon ends and orfA or orfA+orfB genes (Figure 1C). Transposition onto λ was measured after phage induction in a recA E. coli λ lysogen, and the resulting λ lysates were used for transduction selecting tetracycline resistance. λ::IS607-tet transpositions were obtained for IS607orfA at a frequency of 1 × 10⁻⁷/pfu (Figure 1C), but no confirmed transposition events were obtained with IS607orfAB. We note that the relative expressions of orfB and orfA in the IS607 constructs are likely to be different from those in the native element; nevertheless, these results indicate that OrfA is sufficient for promoting transposition and that OrfB is inhibitory, as concluded earlier (Kersulyte et al., 2000). No transposition events were obtained when OrfA contained a glycine substituted for the predicted active site serine (residue 72), consistent with OrfA catalyzing the transposition reaction through an SR mechanism. The frequency of IS607orfA-tet transposition into λ DNA was about 0.4% of that measured for the well-characterized transposon Tn5.
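As a rough worked check of the comparison with Tn5, the two numbers quoted above can be rearranged to give the Tn5 frequency they imply. This is a minimal sketch: the variable and function names are illustrative, and the Tn5 value it prints is a back-calculation from the stated ratio, not a measurement reported in the text.

```python
# Back-of-envelope check using only the two values quoted in the text.
is607_freq_per_pfu = 1e-7   # IS607orfA-tet transposition frequency (per pfu)
fraction_of_tn5 = 0.004     # "about 0.4%" of the Tn5 frequency

# Implied Tn5 frequency (a back-calculation, not a reported measurement).
implied_tn5_freq = is607_freq_per_pfu / fraction_of_tn5


def fold_difference(reference: float, test: float) -> float:
    """Return how many fold higher `reference` is than `test`."""
    return reference / test


print(f"Implied Tn5 frequency: {implied_tn5_freq:.1e} per pfu")
print(f"Fold difference: {fold_difference(implied_tn5_freq, is607_freq_per_pfu):.0f}")
```

Under these assumptions the implied Tn5 frequency is on the order of 10⁻⁵ per pfu, roughly 250-fold above that of IS607orfA.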

PCR analysis of the λ::IS607orfA-tet insertions confirmed the events were simple insertions and sequences of the new transposon-host boundaries showed that all insertions were at a GG dinucleotide target with no duplications of host sequence at the junctions (Figure 1C and Figure 1—figure supplement 1). A compilation of transposition events promoted by IS607orfA (this work) and IS607orfAB elements (Kersulyte et al., 2000) shows that a (G)GG sequence is a preferred target, but no additional sequence relationships among the targets are evident (Figure 1—figure supplement 1). A GG dinucleotide at the transposon termini, together with an invariant GG at the insertion target site, is consistent with a DNA exchange reaction over a 2 bp identical sequence that is observed for other SRs.
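The target preference described above can be illustrated with a short scan for candidate GG sites. This is a sketch only: the function name and the example sequence are hypothetical, only the strand supplied is scanned, and the presence of a GG says nothing about whether a site would actually be used.

```python
def find_gg_targets(seq: str) -> list[dict]:
    """Locate every GG dinucleotide in `seq` (0-based start positions).

    Each hit records whether the GG is preceded by another G, i.e. whether
    it sits in the (G)GG context noted as a preferred target.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 1):
        if seq[i:i + 2] == "GG":
            hits.append({"pos": i, "ggg_context": i > 0 and seq[i - 1] == "G"})
    return hits


# Hypothetical target sequence, invented purely for illustration.
example = "ATGGCATTGGGACCTA"
for hit in find_gg_targets(example):
    print(hit)
```

Overlapping hits within a GGG run are reported separately, so a single GGG yields two candidate GG positions.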

IS607-family TnpA domain architecture

The in vivo studies indicate that OrfA, hereafter called TnpA, is the only IS607 protein required for transposition. Purified preparations of recombinant TnpA from IS607, IS1535, and ISC1926 were obtained, and each protein was shown to be active for DNA binding to its cognate transposon ends (below and not shown). To probe domain architectures, each TnpA was subjected to partial proteolysis under native conditions followed by SDS-PAGE and mass spectrometry (Figure 2A and Figure 2—figure supplement 1). In each case, a trypsin-resistant fragment representing the catalytic domain and helix E region attached to a 3- (TnpAISC1926) to 11- (TnpAIS1535) residue N-terminal segment, which is predicted to be unstructured, was generated. Trypsin also cleaves near the middle of the helix E region of TnpAIS1535 and TnpAIS607 where available crystal structures show a ~4-residue turn separating the N- and C-terminal sections of the helix (see below). Structural models (Phyre2) of the N-terminal domains predict winged-helix motifs that closely match protein-DNA structures present in the PDB (Figure 2—figure supplement 1).

Figure 2 with 2 supplements
Structures of TnpA proteins.

(A) Domain architecture of TnpA proteins. Domain structures were derived from partial proteolysis/mass spectrometry (Figure 2—figure supplement 1), X-ray crystallography for TnpAIS1535 and TnpAISC1926, and Phyre2 models for the N-terminal DBDs (Figure 2—figure supplement 1) and the TnpAIS607 CTD. S denotes the predicted active site serine residue. (B and C) X-ray structures of the dimeric CTDs of TnpAIS1535 and TnpAISC1926, respectively. The helix E region folds into a 4-helix bundle that stacks on the catalytic core and occludes the catalytic serines. (D) Structure of the smSR γδ resolvase bound to DNA (PDB code: 1GDT). Unlike the TnpA proteins, the dimer interface is over the extended E-helices (salmon), and the DBD (dark green) is at the C-terminus. (E) Subunit structures of TnpA-CTDISC1926 and γδ resolvase highlighting the common folds of the catalytic cores but different helix E structures. (F) TnpAIS1535 (blue) and TnpAISC1926 (green) dimers are aligned over the catalytic domains of subunits A (rmsd = 1.1 Å). Helices B and D at the core dimer interface and the helix E bundles are highlighted to illustrate differences.

https://doi.org/10.7554/eLife.39611.004

IS607-family TnpA structures

X-ray crystal structures for the C-terminal domains (CTDs) of TnpA from IS1535 (residues 51 – 193) and ISC1926 (residues 65 – 221) were determined and found to contain either one or two dimers in their asymmetric unit, respectively (Figure 2B,C, Figure 2—figure supplements 2A–D; Table 1). Each chain adopts a structure that includes four α-helices sandwiching four β-strands from the beginning of the CTD to the end of β4 (TnpAIS1535 residues 51 – 144 and TnpAISC1926 residues 65 – 162), a topology that is identical to that of the catalytic core of smSRs (Figure 2D,E). Pairwise structure alignments to the end of β4 between the catalytic domains of TnpAIS1535 and TnpAISC1926 and the smSRs γδ resolvase (PDB code 1GDT) and Sin (PDB code 2R0Q) dimers give rms deviations from 1.6 to 3.3 Å, even though pairwise sequence comparisons between the catalytic domains of TnpA proteins and smSRs typically exhibit <30% amino acid identity (with short indels).
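The two similarity measures quoted above, rms deviation and percent identity, can be sketched as follows. This is a minimal illustration, not the alignment procedure used for the structures: it assumes the coordinate sets are already superposed and paired atom by atom, it uses one of several possible definitions of percent identity, and all inputs are invented.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes the two coordinate sets are already superposed and that atoms
    are paired by index; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical inputs, invented for illustration only.
ca_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
ca_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
print(rmsd(ca_a, ca_b))
print(percent_identity("MKT-LV", "MRTALV"))
```

Gapped columns are excluded from the identity denominator here; counting them instead would lower the reported percentage.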

Table 1
X-ray diffraction data and refinement statistics.
https://doi.org/10.7554/eLife.39611.007
| Structure | ISC1926-TnpA | IS1535-TnpA – Native | IS1535-TnpA – SeMet |
| --- | --- | --- | --- |
| PDB code | 6DGC | 6DGB | |
| Data collection | | | |
| Beamline | APS 24 ID-C | APS 24 ID-C | APS 24 ID-C |
| Space group | C1 | P212121 | P212121 |
| Unit cell dimensions a, b, c (Å) | 97.1, 212.3, 61.6 | 52.6, 54.2, 104.38 | 52.3, 54.1, 104.5 |
| Unit cell angles α, β, γ (°) | 90.0, 126.7, 90.0 | 90.0, 90.0, 90.0 | 90.0, 90.0, 90.0 |
| Wavelength (Å) | 0.9793 | 0.9792 | 0.9792 |
| Resolution range (Å)* | 20 - 2.9 (3.0-2.9) | 48.1 - 2.5 (2.6-2.5) | 52.3 - 2.5 (2.6-2.5) |
| Measured reflections | 71134 | 44467 | 68063 |
| Unique reflections | 19649 | 10275 | 19606 |
| Rmerge | 5.0 (51.1) | 9.9 (64.8) | 7.9 (75.9) |
| CC1/2 | 0.99 (0.76) | 0.99 (0.85) | 0.99 (0.82) |
| I/σ | 12.8 (1.3) | 6.5 (1.5) | 10.0 (1.3) |
| Completeness (%) | 88.6 (56.8) | 95.2 (91.4) | 98.8 (95.0) |
| Refinement | | | |
| Resolution (Å) | 2.9 | 2.5 | |
| No. of reflections | 15775 | 7951 | |
| Rwork | 22.0 | 22.8 | |
| Rfree | 24.6 | 26.1 | |
| RMSD bond length (Å) | 0.01 | 0.01 | |
| RMSD bond angle (°) | 1.15 | 1.17 | |
| No. of atoms: protein | 3996 | 2002 | |
| No. of atoms: water | 0 | 24 | |
| Average B factor: protein | 76.4 | 50.2 | |
| Average B factor: solvent | | 28.9 | |
| Ramachandran statistics§ | | | |
| Favored | 97.1 | 95.2 | |
| Allowed | 2.9 | 4.6 | |
| Outliers | 0 | 0.2 | |
* Values in parentheses refer to the highest resolution shell.

Rmerge = Σ|I − ⟨I⟩| / ΣI

Rfree was calculated using 5% (IS1535) and 10% (ISC1926) of the data.

§ Percentage of residues in Ramachandran plot regions was determined using PROCHECK.
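The Rmerge definition in the table footnote can be written out as a short function. This is a sketch under stated assumptions: the function name, the input layout, and the example intensities are all invented for illustration, and data-processing programs may apply additional conventions (for example, how singly measured reflections are handled) that are not modeled here.

```python
from collections import defaultdict


def r_merge(observations):
    """Compute Rmerge = sum|I - <I>| / sum(I) over all measurements.

    `observations` is an iterable of (hkl, intensity) pairs, where `hkl` is
    any hashable label for a unique reflection and repeated labels are
    repeated or symmetry-equivalent measurements of that reflection.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Hypothetical measurements, invented for illustration only.
data = [((1, 0, 0), 100.0), ((1, 0, 0), 110.0), ((0, 1, 0), 50.0), ((0, 1, 0), 50.0)]
print(f"Rmerge = {100 * r_merge(data):.1f}%")
```

The function returns a fraction; the values in Table 1 appear to be given as percentages, so the example multiplies by 100 before printing.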

Although there is considerable structural similarity between the catalytic core domains of the subunits, the quaternary structures of the TnpA and smSR dimers are radically different. The dimerization interface of the core domains of TnpA is between helices B and D of each subunit (Figure 2B,C), which do not share contacts in the smSR dimer structures (e.g., Figure 2D). The 961 Å² (TnpAIS1535) and 924 Å² (TnpAISC1926) dimer interfaces within the core are relatively flat and hydrophobic, but there are a few polar connections between the subunits. By contrast, smSR subunits are associated in the dimer via their helix E regions through almost exclusively hydrophobic contacts (Figure 2D). Although the overall configurations of the TnpAIS1535 and TnpAISC1926 dimers are similar, there are significant differences in the details of the dimer interfaces within the core (Figure 2F). The TnpAISC1926 subunits are shifted apart by about 3 Å relative to TnpAIS1535, the TnpAISC1926 D helices are angled by about 15° rather than being parallel, and the TnpAISC1926 B helices are one turn longer than in TnpAIS1535.

The active site serines of each TnpA dimer (TnpAIS1535 residue 59 and TnpAISC1926 residue 74) are separated by 28.6 and 31.5 Å (Cα atoms), respectively (Figure 2B,C). This is a much longer distance than would be predicted to catalyze cleavage of scissile phosphates across the minor groove of B-DNA, assuming a 2 bp staggered cleavage (~14 Å separation) that is common to other SRs. An even longer separation between active site serines is present in the catalytically-inactive dimers of γδ resolvase and Sin (Mouw et al., 2008; Yang and Steitz, 1995).

The helix E regions of the TnpA dimer structures are also completely different from those of other SRs (Figure 2E). After β4 in the TnpA dimers, a poorly structured 9 – 10 residue peptide travels along one side of the active site to connect to the helix E region (Figure 2—figure supplement 2D). The E helices are interrupted by a four residue β-turn (GRRG in TnpAIS1535 and GMRS in TnpAISC1926) and fold into an antiparallel structure. The split E helices from each subunit associate into a 4-helix bundle, with the C-terminal segments of TnpAIS1535 helix E rotated 35° relative to those of TnpAISC1926 (Figure 2F). The helix E region excludes a total of about 3685 Å² of solvent accessible surface area in both proteins and would sterically prevent DNA from associating with the active sites (Figure 2B,C,F). The helix E conformation and the separation of active site serines indicate that this dimer conformation cannot be active for DNA chemistry (see also Boocock and Rice, 2013).

The structures of the IS1535 and ISC1926 TnpA dimers are very similar to SRs from Methanocaldococcus jannaschii (PDB code 3LHK) and Sulfolobus solfataricus (PDB codes 3ILX and 3LHF) (Figure 2—figure supplement 2E), which have been discussed previously (Boocock and Rice, 2013). Nevertheless, because the quaternary structures of the TnpA-like proteins are so different from other SRs and because these differences have profound functional implications, we tested aspects of the dimeric structure by cysteine crosslinking. Cysteines were substituted at TnpAIS1535 residues within the catalytic core and helix E region where they would be proximal and oriented appropriately for intersubunit disulfide formation (Figure 2—figure supplement 2F). F126C, located just before the start of helix D, and Q138C at the C-terminal end of helix D efficiently formed dimers after oxidation. Within the helix E bundle, L162C generated substantial amounts of covalently-linked dimers and A182C generated a small amount of dimers after oxidation. These solution results substantiate the dimeric structures observed by crystallography.

Binding of TnpA to the transposon ends

DNA binding by full-length TnpA proteins of IS607, IS1535, and ISC1926 to their respective transposon ends was observed by gel mobility shift assays (EMSAs). Binding by the TnpAIS1535 to its left end (LE) was the most robust so we focus on IS1535 in the analysis below. As expected, no DNA binding was observed for TnpAIS1535 missing residues 1 – 50 comprising the N-terminal winged-helix domain.

Figure 3A shows complexes formed with increasing amounts of TnpAIS1535 incubated with radiolabeled DNA probe of the IS1535 LE plus adjacent host DNA and separated by native PAGE. Formation of a slowly migrating complex is accompanied by loss of the free LE probe. We show in Figure 3F (lanes 2 – 10) that the slowly migrating complex contains two LE DNA segments (i.e., a paired-end complex, PEC) by incubating TnpA with the radiolabeled 149 bp probe plus excess unlabeled 240 bp LE fragments. This results in formation of a supershifted complex, demonstrating the presence of both the labeled and unlabeled LE DNA fragments. Most of the LE probe associates into PECs with <10 nM TnpA (Figure 3A,B). A much lower level of a complex (complex 1) that accumulates with increasing TnpA concentration is also evident, and a small amount of an additional complex (complex 2) is formed at high TnpA concentrations. Appearance of complex 2 is accompanied by a similar decrease of PECs. Formation of PECs is strongly enhanced by Mg²⁺, Ca²⁺, Mn²⁺, or spermidine; in the presence of EDTA, PEC levels severely decrease and complex 1 coordinately increases (Figure 3—figure supplement 1). A time course of PEC assembly on left ends by 10 nM TnpA indicates that PECs form relatively slowly, requiring about 30 min to reach maximum levels (Figure 3C,D). Time course experiments performed at the optimal 37°C or at lower temperatures (not shown), where rates of formation are slower and yields of PECs are decreased, provide no evidence that complex 1 is a kinetic intermediate.

Figure 3 with 1 supplement
Binding of IS1535 TnpA to transposon ends.

(A) Increasing amounts of TnpA (1 to 128 nM in 2-fold increments) were incubated with a 149 bp 32P-labeled DNA fragment containing the left transposon end and adjacent host sequence. After 1 hr at 37°C, the samples were subjected to native PAGE. The locations of unbound probe (free LE), paired-end complex (PEC), complex 1 (c1) and complex 2 (c2) are denoted. (B) Plot showing relative amounts of the PEC, complex 1, and complex 2 as a function of TnpA concentration. The insert expands the lower TnpA concentration range leading to maximum levels of PECs. (C) Time course of LE-PEC formation. TnpA (8 nM) was incubated with the LE probe at 37°C for increasing times as denoted and applied to a native gel. (D) Plot of the accumulation of LE-PECs and complex 1 as a function of time. (E) TnpA complexes formed on the right end. Reactions were performed as in panel A except that a 139 bp RE DNA probe was used. (F) Formation of hetero-PECs with different lengths LE or RE DNA fragments. In lanes 2 and 12, 100 nM TnpA was incubated with 0.5 nM 149 bp radiolabeled LE probe (*LE). In lanes 3 – 10, increasing amounts of unlabeled 240 bp LE fragments (2 to 128 nM, in 2-fold increments) were included in the reaction. Radiolabeled PECs, but not complex 1 or 2, shift to a slower migrating species in the presence of excess 240 bp LE fragments indicating that these complexes contain both 149 and 240 bp LE DNA molecules. In lanes 13 – 19, increasing amounts of unlabeled 230 bp RE fragments (2 to 128 nM, in 2-fold increments) were included in the reaction with *LE. A small amount of LE + RE PECs form at high RE concentrations. Lanes 1 and 11 are *LE only.

https://doi.org/10.7554/eLife.39611.008

Gel mobility shift assays performed on the right transposon end (RE) generate a different profile (Figure 3E). Only a small amount of PECs (3% of total probe) are generated, peaking at 8 nM TnpA, whereas complex 1 continues to increase to become the dominant product at high TnpA concentrations. The RE is also inefficient at forming PECs with the LE (Figure 3F, lanes 12 – 19). The poor substrate activity of the RE correlates with the presence of only two sequence motifs (Figure 1B).

TnpAIS1535 binds over a remarkably long DNA segment in LE-PECs

TnpAIS1535 LE-PEC assembly reactions were subjected to DNase I footprinting. Protections from DNase I cleavage occurred from LE bp 7 (LE 7; LE 11 on the bottom strand) and extend internally to about LE 75 at TnpA concentrations generating PECs (Figure 4A,E). DNA sequences over motifs a-d show particularly strong protections together with a series of cleavage enhancements that are separated by about 10 bp. The protected region, albeit weaker, continues internally from motif d to about LE 75. Clear evidence of TnpA binding over motif e is present, but surprisingly weak protections are detectable at nucleotides surrounding the transposon-host junction. Notably, sequences outside of core motifs a-d become protected with increasing TnpA concentrations coordinately with the core motifs, implying cooperative binding of TnpA over about 70 bp of the LE concurrent with formation of the PEC (Figure 4—figure supplement 1).

Figure 4 with 1 supplement
Footprint analysis of IS1535 TnpA binding to the transposon ends.

(A) DNase I footprints of TnpA to 5′ end-labeled bottom (left panel) and top (right panel) strands of the LE. TnpA concentrations were from 4 to 128 nM, in 2-fold increasing concentrations, 0 is no TnpA added, and ATCG are dideoxy sequencing lanes primed by the same oligonucleotide used to prepare the footprinting probe. Numbers on the left denote transposon sequence coordinates and are positioned relative to the 0 lane. The black bar on the right marks transposon sequences with arrows showing motif locations. The dashed line denotes regions of significant changes in DNase I cleavage by TnpA. See Figure 4—figure supplement 1 for EMSAs of binding reactions just prior to DNase I digestion showing relative amounts of PECs. (B) Boundaries of TnpA binding to the LE delineated by Exo III digestion. PEC-assembly reactions, containing from 1 to 128 nM TnpA in 2-fold increasing concentrations, were incubated with Exo III for 30 min. Lane 0 is no TnpA and -exo is no Exo III added. Solid arrowheads indicate major Exo III digestion stops, and open arrowheads denote minor Exo III stops that are TnpA dependent. (C) Time course of Exo III digestion on LE PECs. Preassembled PECs were subjected to Exo III digestion for 0 – 40 min as labeled. (D) Exo III digestion stops on the RE. Reactions were the same as in panel B except that 5′ end-labeled DNA probes representing the RE DNA strands were used. (E) Summary of DNase I and Exo III footprinting data on the LE and RE sequences. Changes in DNase I reactivity by TnpA are denoted with blue lines; dashed lines indicate weak protection. Red arrows denote Exo III digestion stops; shorter arrows signify minor stops and arrows in parentheses are stops appearing after long digestion times. IS1535 end sequence motifs (open arrows) are positioned above the sequence.

https://doi.org/10.7554/eLife.39611.010

Digestion by Exo III, a 3′ to 5′ exonuclease, generates a weak TnpAIS1535-dependent stop at LE 8 and a strong stop at LE 18 on the top strand (Figure 4B,E). On the bottom strand, Exo III digestion stops occur at LE 78/77 (weak) and LE 68/67 (strong). Increasing Exo III digestion times on LE-PECs suggest that the nuclease can progressively remove 10 bp blocks of TnpAIS1535-mediated protection (Figure 4C). For example, the weaker stop at LE 8 near the host boundary is nearly lost at long digestion times, and longer digestion times on the bottom strand result in loss of the LE 78/77 stop, increasing amounts of the LE 68/67 stop, and a new product at LE 59/58. Taken together, the Exo III and DNase I footprinting results indicate that strong TnpA binding to the LE occurs between approximately LE 18 and LE 67 with weaker binding extending at least 10 bp in both directions. Both footprinting methods indicate weak, if any, binding over and adjacent to the transposon-host boundary.
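The roughly 10 bp spacing of the Exo III stops can be checked directly from the coordinates quoted above. This sketch uses the first coordinate of each bottom-strand pair (78/77, 68/67, 59/58); the 10.5 bp helical repeat is the textbook value for B-DNA and is an assumption introduced here, not a number from the text.

```python
HELICAL_REPEAT_BP = 10.5  # textbook B-DNA value; an assumption, not from the text

# Exo III stop positions quoted in the text (LE coordinates).
top_strand_stops = [8, 18]           # weak stop, strong stop
bottom_strand_stops = [78, 68, 59]   # first coordinate of 78/77, 68/67, 59/58


def spacings(positions):
    """Absolute distance between consecutive stop positions."""
    return [abs(b - a) for a, b in zip(positions, positions[1:])]


# Strongly bound region between the major stops, counting both endpoints.
strong_span_bp = 67 - 18 + 1

print(spacings(top_strand_stops))     # [10]
print(spacings(bottom_strand_stops))  # [10, 9]
print(strong_span_bp, round(strong_span_bp / HELICAL_REPEAT_BP, 1))  # 50 4.8
```

Under these assumptions the strongly protected region spans about 50 bp, a little under five helical turns, with successive stops spaced close to one turn apart.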

TnpAIS1535 binds only over the two motifs on the IS1535 RE

As described above, TnpA binding to the IS1535 RE primarily forms a complex 1 product (Figure 3E). Incubation of TnpA-RE reactions with Exo III resulted in digestion stops at RE 7 (top strand) and RE 28 (bottom strand), which flank the two motifs present on this end (Figure 4D,E). TnpA was unable to protect the IS1535 RE from DNase I cleavage, although weak cleavage enhancements were evident at positions within the two motifs that are analogous to the strong enhancements observed in the LE motifs (not shown). The 20 bp Exo III protected region on the RE provides evidence that complex 1 reflects TnpA binding to two adjacent motifs.

LE-PEC formation requires IS1535 motifs a-d plus flanking non-specific DNA sequences

Gel mobility shift assays on probes with progressively truncated endpoints internal to the LE reveal that about 84 bp are required for robust PEC formation (Figure 5, top panel, and Figure 5—figure supplement 1A). Less efficient PEC assembly is observed with LE segments deleted down to 69 bp, with substrates containing endpoints at LEΔ74 and LEΔ69 generating faster migrating PECs, suggesting fewer molecules of TnpA in the complex. No detectable PECs form with a substrate truncated at LEΔ64. Amounts of complex 1 generally increase as PEC levels decrease until LEΔ44 where levels of complex 1 diminish, and LEΔ39, where complex 1 is not detectable. Addition of non-specific DNA to the LE 39 end restores complex 1 formation (Figure 5—figure supplement 1D).

Figure 5 with 1 supplement
DNA sequence requirements for IS1535 LE-PEC assembly.

Top series are LE truncations beginning internal to the transposon. Middle series are truncations beginning within host DNA (H) flanking the transposon. Bottom series are LE sequences from transposon nt 20 to various internal endpoints embedded in vector DNA. PEC assembly was averaged from at least three different experiments for each probe. The concentrations of TnpAIS1535 required for 50% conversion of the probe to PECs are listed; if <50% of the probe was converted to PECs, the maximum yield of PECs obtained over the TnpA titration series (up to 128 nM TnpA) is given in parentheses. ND indicates PECs are not detected in the EMSAs. The presence of complex 1 is denoted by +, absence by -, and barely detectable levels by +/-. See Figure 5—figure supplement 1 for supporting data.
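The quantity tabulated in this figure, the TnpA concentration giving 50% conversion of probe to PECs, could be estimated from a 2-fold titration series roughly as sketched below. The function names, the interpolation on a log2 concentration scale, and the PEC fractions are all assumptions made for illustration; the text does not state how the values were derived.

```python
import math


def twofold_series(start_nm, stop_nm):
    """Concentrations from start to stop (inclusive) in 2-fold steps."""
    series = []
    c = start_nm
    while c <= stop_nm:
        series.append(c)
        c *= 2
    return series


def conc_for_half_conversion(concs, fractions, threshold=0.5):
    """Interpolate, on a log2 concentration scale, the concentration at which
    the PEC fraction first rises through `threshold`.

    Returns None if no upward crossing is found, in which case the maximum
    yield (max(fractions)) is the more informative number to report.
    """
    points = list(zip(concs, fractions))
    for (c0, f0), (c1, f1) in zip(points, points[1:]):
        if f0 < threshold <= f1:
            t = (threshold - f0) / (f1 - f0)
            return 2 ** (math.log2(c0) + t * (math.log2(c1) - math.log2(c0)))
    return None


# Hypothetical PEC fractions across a 1 to 128 nM series (not measured data).
concs = twofold_series(1, 128)
pec_fraction = [0.05, 0.15, 0.35, 0.65, 0.80, 0.85, 0.85, 0.80]
print(concs)
print(conc_for_half_conversion(concs, pec_fraction))
```

For a probe that never reaches 50% conversion the function returns None, mirroring the caption's convention of reporting the maximum yield in parentheses instead.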

https://doi.org/10.7554/eLife.39611.012

Resections of host DNA and sequences at the transposon end result in moderately decreasing efficiencies of PEC formation, with LE5Δ, which removes 4 bp of the transposon end, requiring about 10-fold more TnpA than full length substrates (Figure 5, middle panel, and Figure 5—figure supplement 1B). Low levels of PECs are generated with LE10Δ, LE15Δ, and LE20Δ, which remove DNA up to the beginning of motif a, and PECs are not detectable with LE25Δ, which removes part of motif a. As observed with the upstream resections, complex 1 levels increase somewhat as PEC assembly becomes less efficient but decrease markedly with the LE25Δ truncation where motif a is disrupted.

The upstream and downstream truncation series define the minimal LE DNA segment required for detectable PEC assembly to be between LE 69 and LE 20. These boundaries are consistent with the major Exo III protected borders between LE 67 and LE 18. Appending vector DNA onto the LE20Δ junction (LE20v) fully restores efficient PEC assembly (Figure 5—figure supplement 1D). However, appending vector DNA onto the LE25Δ junction did not enable PEC formation or increase levels of complex 1. Although LEΔ64 is inactive for PEC formation (Figure 5, top panel, Figure 5—figure supplement 1A), appending vector DNA onto the LEΔ64 end (in the context of LE20v) fully restores PEC assembly (Figure 5, bottom panel, and Figure 5—figure supplement 1C). PEC formation remains efficient on a probe containing transposon sequences down to LE 54 when fused to vector DNA; LE(v54-20v) contains part of motif d through motif a. Removal of transposon sequences into motif c (v49-20v and v44-20v), however, markedly decreases PEC formation, and a substrate containing only LE transposon sequence comprising motifs a and b (v39-20v) only forms barely detectable levels of PECs but generates complex 1 (Figure 5, bottom panel).

Taken together, these results demonstrate that transposon sequences contained in motifs a-d (LE 59 to LE 20) encompass the minimal IS1535 DNA required for efficient PEC assembly. However, PEC formation requires at least an additional 10 bp of non-specific DNA upstream of motif d (an additional 25 bp for robust formation), and about 30 bp of nonspecific DNA downstream of motif a for fully efficient PEC formation.

IS1535 LE core motif sequences nucleate formation of the PEC

LE(v54-20v) efficiently forms PECs even though it contains only 35 bp of transposon sequence corresponding to motif a through part of motif d (Figure 5, bottom panel, and Figure 5—figure supplement 1C). TnpAIS1535 strongly protects sequences from DNase I cleavage on LE(v54-20v) PECs over the core motifs a-d, and protections extend into vector sequences on either side of the core motifs at least as far as observed for the native LE (Figure 6A,C). The major Exo III stops on LE(v54-20v) are at the boundaries of motifs a-d (Figure 6B,C).

Footprint analysis of IS1535 deletion substrate LE(v54-20v) containing the minimal transposon sequences required for efficient PEC assembly.

(A) DNase I footprints of PEC assembly reactions on 5′-32P-labeled bottom and top strands of LE(v54-20v). TnpA concentrations were from 4 to 128 nM in 2-fold increasing amounts. Shaded rectangles on the left of the gels denote the positions of transposon sequences; coordinates labeled with v are vector sequences with vH being the equivalent locations of host DNA. The bars on the right of the gels denote regions of significant changes in DNase I reactivity by TnpA with dashes indicating weakly protected regions. (B) Exonuclease III delineated boundaries of TnpA binding. TnpA concentrations are the same as in panel A. (C) Summary of DNase I (strongly protected regions, blue) and Exo III digestion boundaries on the LE(v54-20v) sequence. Small letters denote vector sequence.

https://doi.org/10.7554/eLife.39611.014

The profile on LE(v54-20v) suggests a mechanism by which TnpAIS1535 binds cooperatively and with high affinity to the four core motifs a-d, even with only half of the native motif d sequence present. Additional molecules of TnpA then spread in either direction from the core ‘nucleation’ segment in a sequence-independent manner. When the motif d sequence is completely absent, as in LEΔ49 (Figure 5, bottom panel, and Figure 5—figure supplement 1C), PEC formation is inefficient, and no PECs are formed when motif a is partially removed (LE25v, Figure 5—figure supplement 1C).

The helix E region, but not the TnpA catalytic core domain, is remodeled during PEC assembly

As discussed above, the quaternary structure of TnpA solution dimers is very different from that of other serine recombinases and is predicted to be in an inactive conformation for DNA chemistry. We asked whether conformational changes were required for cooperative DNA binding and formation of the PEC. Single-cysteine derivatives of TnpAIS1535 were oxidized to form disulfide-linked dimers and evaluated for their ability to assemble PECs. Cys126 near the N-terminal end of helix D and Cys138 at the C-terminal end of helix D were efficiently oxidized into covalent dimers that lock the two dimeric subunits together within the catalytic domain core (Figure 7A,B). Both of these disulfide-linked mutant dimers efficiently formed PECs (Figure 7C,D). We conclude that a rearrangement of subunits within the catalytic core of the dimer is not required for PEC assembly.

Figure 7
Activities of crosslinked TnpAIS1535 dimers.

(A) TnpAIS1535 dimer structure highlighting residues 126, 138, and 162, which are modeled as cysteines in rotamers compatible with disulfide formation. The helix E region on the right is rotated clockwise in the Y plane about 90° to better visualize Cys162. (B) Non-reducing SDS-PAGE of reduced and oxidized preparations of TnpAIS1535 mutants containing single cysteine residues. The three native cysteines were replaced with serines in these mutants. (C–E) EMSAs of PEC assembly by reduced and oxidized preparations of Cys126, Cys138, and Cys162 mutants, respectively. The LE probe was incubated with 1 to 64 nM TnpA mutant in 2-fold increasing concentrations. (F) PEC assembly by wild-type TnpA and a deletion mutant missing the helix E region (residues 147–193). TnpA concentrations are the same as in panels C-E except that a reaction with 128 nM TnpAΔ(147-193) was included. The location of residue 146 at the C-terminus of this mutant is shown in panel A.

https://doi.org/10.7554/eLife.39611.015

Cys162, within the helix E region, generated only about 40% disulfide-linked dimers upon oxidation (Figure 7A,B). Nevertheless, oxidized Cys162 was completely defective in PEC formation, whereas the reduced cysteine mutant was active (Figure 7E). In an additional experiment, we removed the entire helix E region. TnpAΔ(147-193), which contains residues up to the end of β4, was defective for PEC formation (Figure 7F). However, at very high protein concentrations a small amount of product migrating as a PEC is observed. Incubation of TnpAΔ(147-193) with the RE fails to generate PECs, as expected, but products migrating as complex 1 are formed (Figure 7—figure supplement 1), indicating the truncated protein remains active for forming this species. The properties of the helix E deletion mutant provide evidence that helix E performs an important function in cooperative binding to form PECs but probably not directly in synaptic interactions because a small amount of LE-LE PEC can form. The inability of the covalently-crosslinked E helices to assemble PECs suggests that a conformational rearrangement of the helix E region of the dimer is required, perhaps to enable helix E interactions between adjacent dimers bound to the transposon ends.

Discussion

The IS607 family of DNA transposable elements exhibits many features that are not typically found in other transposable elements. The ends of IS607 elements are not bordered by terminal inverted repeated sequences, and there are no duplications of target sequence at the transposon-host DNA borders; however, multiple short sequence motifs internal to the ends are present (this paper, Blount and Grogan, 2005; Kersulyte et al., 2000). Most IS607-family members terminate in GG, and in the cases of IS607 and ISC1926, insert into a GG target sequence. IS607-family elements encode two orfs whose coding sequences encompass nearly all of the DNA between the transposon ends. For IS607, orfA/tnpA is necessary and sufficient for transposition in E. coli (this paper and Kersulyte et al., 2000). TnpA binds specifically to the transposon ends and is a member of the SR family of DNA exchange enzymes, which are most often associated with site-specific recombination reactions. Residues implicated in catalysis by SRs are well conserved within IS607-family TnpA proteins, and we demonstrate here that the presumed active site serine is required for transposition. Strikingly, however, the dimeric structure is radically different from other SRs. OrfB, whose presence appears to negatively impact IS607 transposition rates in E. coli (this paper and Kersulyte et al., 2000), may function as a negative regulator, or perhaps in an ancillary role such as DNA repair, in particular hosts (Kapitonov et al., 2015; Kersulyte et al., 2000). OrfB-like genes are often associated with IS605/608-family transposons, and OrfB (TnpB) from ISDra2 has also been reported to function as a negative regulator (Pasternak et al., 2013).

Assembly and architecture of the paired-end complex

The first major step in a transposition reaction is formation of a paired-end (synaptic) complex leading to a chemically-active transpososome (Hickman and Dyda, 2015). We show that TnpAIS1535 binds in a robust and highly cooperative manner to multiple binding sites within the LE of the element and can efficiently recruit a second LE to generate a paired-end complex. Binding nucleates over four 9 bp directly-repeated motifs that are positioned in a helically-phased manner from about 20 to 60 bp from the IS1535 LE terminus (Figure 1B). Transposon sequences beginning at motif a (LE bp 21) through the conserved half of motif d (LE bp 54) are essential for efficient paired-end complex formation. However, additional non-specific DNA sequences extending to about 84 bp from the left end are required for efficient PEC assembly. Likewise, additional DNA extending from motif a to shortly beyond the transposon-host junction improves the efficiency of PEC assembly. Although this region contains motif e, which is spaced one bp closer to motif a than the spacing between motifs a-d, the presence of the motif e sequence has little discernable effect on PEC assembly. We find it surprising that the sequence identity of the terminal 19 bp of the LE does not significantly influence PEC formation.

Footprinting data on LE-TnpAIS1535 PEC complexes are consistent with the LE resections. TnpAIS1535 strongly binds over motifs a-d, but the overall region of binding extends from before motif e to about 75 bp from the LE terminus. Significantly, only very weak binding is evident at sequences surrounding the transposon-host junction where DNA chemistry must occur. PECs formed with LE substrates containing nonspecific sequences downstream of motif a actually exhibited greater protections from DNase I cleavage over the region that would be positioned at the transposon border, possibly implying that the native sequence near the LE terminus may be suboptimal for TnpA dimer binding. The boundaries of TnpA binding within LE PECs revealed by Exo III digestion support a model whereby multiple TnpA proteins coat long segments of the left ends. Initial Exo III stops indicate TnpA binding from 8 to 78 bp from the left end terminus. Profiles obtained upon increasing Exo III digestion are consistent with the exonuclease removing TnpA molecules bound to units of about 10 bp, revealing borders of the TnpA nucleoprotein filament from 8 and 18 (major) bp from the host junction extending to 78, 68 (major) and 59 bp within the element.

In contrast to the LE, the IS1535 RE is a poor substrate for TnpA binding. Only small amounts of RE-RE or RE-LE PECs are detectable, although a complex 1 species is formed at high protein concentrations. Exo III footprinting shows that the RE complex 1 contains TnpA bound only to the two sequence motifs that are present between bp 10 and 28 from the RE-host junction. The significance of the differences between the IS1535 ends for its transposition reaction remains to be determined. However, the distribution of sequence motifs, along with our preliminary end binding experiments with TnpA from IS607 or ISC1926, suggests that this disparity is not present in these elements and thus may not be a general feature of IS607-family transposons.

The structures of the IS607-family TnpA catalytic domain dimers pose a number of questions with respect to how DNA binding and catalysis occur, especially in the light of the radically different oligomeric conformations of other SR family members. Whereas the catalytic domains of the smSRs oligomerize via interactions between their helix E regions, TnpA catalytic domains dimerize over their B and D helices. The helix E regions of TnpA dimers are split and folded into a physically separate four helix bundle that is attached to the catalytic domains by a flexible polypeptide linker. The helix E bundle would sterically exclude DNA from associating with the active site. Therefore, minimally, a reconfiguration of the helix E region would be required to enable DNA catalysis. In addition, the active site is located on the opposite side of the subunit from its DNA binding domain (see models below), raising the possibility that cleavage of the synapsed transposon end and/or target DNA may be in trans with respect to the DNA to which the N-terminal DBD is bound (Boocock and Rice, 2013), a recurring feature of transpososome structures (Hickman and Dyda, 2015). By contrast, smSRs cleave the half site to which they are bound (Boocock et al., 1995; Li et al., 2005). An additional paradoxical feature with respect to catalysis is that the active site serines in the TnpA dimer structures are separated by >25 Å, much too far to cleave on either side of the GG dinucleotide at the transposon ends and host target in a manner consistent with other SRs. These and other comparative features with smSRs make it likely that a large conformational change in the oligomeric structure of TnpA precedes DNA cleavage and exchange.

Nevertheless, we show here that the quaternary structure of the catalytic domain is active for assembling PECs, as evidenced by the robust formation of PECs by IS1535 TnpA dimers covalently crosslinked over the core subunit interface. However, the inability of dimers with covalently-linked E helices to cooperatively bind the LE and form PECs provides strong evidence that the helix E region does undergo conformational rearrangement during PEC assembly. An attractive model is that the helix E regions from adjacent dimers remodel to interact with each other during the cooperative loading of proteins along the transposon end. IS1535 TnpA proteins deleted for the entire helix E region appear competent to form PECs at very high protein concentrations, supporting a role for a remodeled helix E in promoting cooperative binding between dimers bound laterally along the transposon ends. The finding that a small amount of PECs appear to still form when the entire helix E region is deleted suggests that helix E is not directly required for synaptic interactions.

Proteolysis experiments and structure modeling suggest that winged-helix DNA binding domains are linked to the N-terminal end of the catalytic domain by peptide chains ranging in length from just three residues (ISC1926) to about 10 residues (IS1535). Structural models of DNA-bound TnpAISC1926 dimers, where there is predicted to be less conformational freedom between the DBD and catalytic domains, are shown in Figure 8. In panel A, the recognition α-helices of the two DBDs are inserted into the major groove of a DNA model of the IS1535 LE segment containing motifs a and b at positions consistent with protections from dimethyl sulfate reactivity at guanines by TnpAIS1535 (Figure 8—figure supplement 1). The N-termini of the catalytic domains of the TnpAIS1535 and TnpAISC1926 dimers are separated by a distance that is close to the pitch of B DNA. Thus, the two DBDs can readily fit into adjacent major groove segments on the same helical face even with the short three residue linker that is present in TnpAISC1926.

Figure 8
Models of TnpA binding to the IS1535 LE and the PEC.

(A) Model of a TnpA dimer in a configuration where the two DBDs are binding to LE motifs a and b (orange DNA) in a manner consistent with DMS protection data (Figure 8—figure supplement 1) and where the wing is over the A/T-rich minor groove. Guanine N7 atoms protected from DMS reactivity by bound TnpA are highlighted as blue spheres. The tandem G/C base pairs at the LE terminus are red and the host DNA is green. The TnpA dimer model is derived from the Phyre2 model of the TnpAISC1926 NTD (Figure 2—figure supplement 1C) linked by three residues to the TnpAISC1926 CTD X-ray structure. We posit this conformation on DNA represents complex 1 (see Figure 8—figure supplement 4). (B) TnpA dimer configuration where only one DBD is associated with a single end. The dimer is rotated orthogonally about the DBD-CTD linker in relation to the dimer in panel A. (C) Four TnpA dimers are bound as in panel B to motifs a-d on one LE. The helix E regions are proposed to engage in helix-swapped interactions between adjacent dimers (e.g., Figure 8—figure supplement 2) to promote cooperative binding. This structure, with additional dimers bound laterally along the LE, may reflect complex 2. (D) Model in panel (C) rotated to show the DNA in an end-on view, highlighting the set of unbound DBDs. (E) Model of the PEC with a second LE associated. Although represented as parallel straight DNAs, the two transposon ends may be in a more interwrapped structure. TnpA protomers in a different, chemically-active, conformation are proposed to be recruited to the end of the filament at the transposon-host junction.

https://doi.org/10.7554/eLife.39611.017

An alternative arrangement is shown in Figure 8B where only one subunit of the dimer binds to a motif on an individual transposon end, leaving the other subunit free to bind a second DNA. In this model, additional dimers would bind in a similar manner to adjacent motifs (Figure 8C) with binding stabilized by remodeling of the helix E regions to generate intermolecular contacts between dimers (e.g., Figure 8—figure supplement 2). This dimer binding configuration accounts for the cooperative assembly of TnpA units covering about 10 bp each, as evidenced in the Exo III footprints. Most importantly, it accounts for the near simultaneous recruitment of both DNA ends into a PEC; a TnpA dimer array on a single transposon end will have an array of appropriately spaced free DBDs (Figure 8D) ready to capture a second unbound transposon end with high affinity (Figure 8E). The parallel arrangement of ends in the PEC is consistent with PEC assembly by a substrate containing two inverted copies of IS1535 LEs separated by only 80 bp (Figure 8—figure supplement 3).

We suggest that complex 1, which is formed in vitro at high TnpA concentrations and does not appear to be a precursor to the PEC (Figure 3A–D), has a TnpA dimer bound in the conformation depicted in Figure 8A. In support of this, adding a mixture of IS1535 TnpA and TnpA-MBP dimers to an LE deletion substrate that forms only complex 1 generates only DNA-bound products representing the two homodimers (Figure 8—figure supplement 4). If complex 1 consisted of two dimers bound as in Figure 8B, a heterotetrameric species that would migrate between the complexes formed by the separate dimer reactions would also be expected.

Complex 2, which also contains only one transposon end, is observed only at very high concentrations of TnpA relative to transposon ends (Figure 3A,B). The presence of complex 2 is correlated with a decrease of PECs as the TnpA to transposon end ratio increases (Figure 3A,B,F). We propose that this complex has dimers bound along individual LEs as in Figure 8C,D, but unbound transposon ends are unavailable to generate a PEC, consistent with a model whereby TnpA bound to one end captures a second unbound end. Alternatively, complex 2 could have two dimers bound in the conformation in Figure 8A to motifs a-d.

Comparison with other transpososome structures

Many transposases, for example Tn5, function as dimers where each subunit associates with the short terminal inverted repeats of both DNA ends within the assembled transpososome (Davies et al., 2000). However, some transposases utilize multiple binding sites within longer end segments, and in some cases, contain distinct DNA binding domains. For example, each subunit of the Mos1 Mariner-family transposase dimer binds to one transposon terminus and additional separate DNA binding domains associate with two subterminal binding sites on the other transposon end within the active complex (Richardson et al., 2009). Assembly of the phage Mu transpososome, which contains four copies of the Mu A protein, is more complex as it involves interactions with a remote enhancer-like element by a distinct DNA binding domain (Harshey, 2014). hAT-family transposons often have many subterminal repeats at variable spacings and orientations, which in some elements, can be located hundreds of base pairs from the transposon termini (Atkinson, 2015). The transposase from the hAT element Hermes is a preassembled donut-shaped octamer that is proposed to bind to four subterminal repeats on each end along with the transposon termini using distinct DNA binding regions (Hickman et al., 2014). The presence of subterminal repeats on hAT-family elements bears some similarity to the sequence motifs of IS607-family transposon ends, but the manner of transposase binding and synapsis proposed here for IS1535 is different.

Current understanding of the IS607-family transposition reaction

We propose the following pathway for formation of the paired-end complex that is expected to be a critical early intermediate in the IS607-family transposition reaction. Multiple TnpA dimers are initially targeted to DNA motifs that can be located over 60 bp from the transposon end (Figure 1B). In the case of the IS1535 LE, four dimers cooperatively bind over four helically-phased motifs beginning 20 bp from the end (Figure 8C), and the nucleoprotein filament continues to spread in a largely sequence-neutral manner to cover at least 70 bp. The active sites are positioned well away from the DNA in the filament, avoiding any spurious cleavage. The single-end complex then captures an unbound end to generate a stable PEC (Figure 8D,E). Because formation of the PEC is relatively slow (Figure 3C,D), we imagine that initial binding to an IS1535 LE limits the rate of PEC formation, but once the single-end filament is formed, an unbound end is rapidly captured. Although we illustrate this complex as two parallel transposon ends, a more interwrapped structure may form. No quaternary change in the dimer structure over the catalytic core domain is required for PEC assembly, but our evidence indicates that the helix E region remodels to facilitate cooperative assembly of the nucleoprotein filament. Almost all IS607-family transposon ends have multiple motifs (the IS1535 RE is an exception with only two), but they are not always spaced by increments of 10–11 bp (Figure 1B). The flexible peptide linkers between the DBD and catalytic domains and between the catalytic domains and helix E regions may enable similar nucleoprotein filaments to assemble, even if some of the recognition motifs are not in helical phase.
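The notion of helical phasing used above can be made concrete with a small calculation. The sketch below is illustrative only: it assumes canonical B-DNA with about 10.5 bp per turn, and the tolerance used to call two positions "on the same face" is an arbitrary choice, not a value from this study. Only the start of motif a at LE bp 21 is taken from the text; the other positions are hypothetical examples.

```python
BP_PER_TURN = 10.5  # assumption: canonical B-DNA helical repeat


def helical_phase_deg(position_bp, reference_bp):
    """Rotation (0-360 degrees) of position_bp around the helix axis
    relative to reference_bp, assuming a uniform helical repeat."""
    offset = position_bp - reference_bp
    return (offset * 360.0 / BP_PER_TURN) % 360.0


def same_face(position_bp, reference_bp, tolerance_deg=60.0):
    """True if the two positions lie on roughly the same helical face.
    tolerance_deg is an arbitrary illustrative cutoff."""
    phase = helical_phase_deg(position_bp, reference_bp)
    deviation = min(phase, 360.0 - phase)
    return deviation <= tolerance_deg


if __name__ == "__main__":
    motif_a_start = 21  # from the text: motif a begins at LE bp 21
    # Hypothetical motif starts spaced 10 or 11 bp apart, plus one
    # half-turn offset (5 bp) for contrast.
    for start in (31, 32, 42, 26):
        print(start, round(helical_phase_deg(start, motif_a_start), 1),
              same_face(start, motif_a_start))
```

Under these assumptions, spacings of 10 or 11 bp keep successive motifs within about 20° of one another, whereas a 5 bp offset places a site on the opposite face, which is where linker flexibility would have to compensate.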

Whereas an 80 bp segment within the IS1535 LE beginning near the transposon terminus is required for maximally efficient PEC assembly, only very weak TnpA binding is observed over the transposon-host DNA junction where TnpA-mediated chemistry on DNA must occur. For the reasons discussed above, and because the active sites in the solution dimer are not positioned appropriately with respect to the terminal GG cleavage sites, an alternate conformation of TnpA is almost certainly required for DNA catalysis. Recruitment of TnpA in a catalytically-active conformer may be a key regulatory step and could require co-translational synthesis or folding localized to a preformed PEC (Duval-Valentin and Chandler, 2011). This could explain the requirement in vivo for the IS607 tnpA gene to be located close to the transposon ends for transposition to occur (Kersulyte et al., 2000) (W.C. and R.C.J., unpublished). Thus, correct assembly of the PEC, with the two transposon ends in correct register, may be a prerequisite or checkpoint for recruiting catalytically-active subunits to bind to the junction, or alternatively, for allosterically activating weakly bound subunits over the junction.

In addition to the structure of the recombination complex, the steps of DNA exchange by serine transposases may also be quite different from other SRs. A subunit rotation mechanism for strand exchange on the complex depicted in Figure 8E would lead to an intramolecular inversion, not transposition. Instead, we suggest that the element may excise from the donor site and then insert into a target locus using serine chemistry without strand transfer coupled to subunit rotation. It is also possible that capture of the target locus could be a prerequisite for DNA cleavage. Because both transposon ends need to recombine into a single GG target, it seems likely that the strand transfer reactions must occur sequentially. As none of these DNA cleavage–transfer steps would necessarily require a subunit rotation reaction, the structure of chemically-active TnpA oligomers may be very different from other SRs that have been trapped in tetrameric structures competent for DNA exchange by subunit rotation.

Materials and methods

Key resources table
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
| --- | --- | --- | --- | --- |
| Gene (Mycobacterium tuberculosis) | IS1535 orfA/tnpA | H37Rv genome DNA | Gene ID: RV0921 | |
| Gene (Helicobacter pylori) | IS607 orfA/tnpA | synthetic gene | NCBI protein ID: AAF05600.1 | |
| Gene (Helicobacter pylori) | IS607 orfB | synthetic gene | NCBI protein ID: WP_001274345.1 | |
| Gene (Sulfolobus islandicus) | ISC1926 orfA/tnpA | S. islandicus genome DNA, PMID: 15612937 | NCBI protein ID: AAV87873.1 | S. islandicus pyrE::ISC1926; Dennis Grogan, University of Cincinnati |
| Strain, strain background (E. coli) | RJ1224 | Laboratory collection | recA56, srl, Δ(pro-lac), ara, rpsL, λbbnincI857,b515, b519, nin5, Sam7] | |
| Strain, strain background (E. coli) | Hfl-1 | PMID: 4352176 | hfl-1, fhuA2::IS2, lacY1, tsx-1, glnX44, gal-6, xyl-7, mtlA2, mut-14 | |
| Strain, strain background (E. coli) | LE392 | PMID: 6291786 | hsdR514 (rk–, mk+), glnX (supE44), tyrT (supF58), Δ(codB-lacI)3, galK2, galT22, metB1, trpR55 | |
| Strain, strain background (E. coli) | BW14879 | PMID: 2160940 | pMW11 Muc62 Δ(lac)X74, Δ(phoA532 Pvull) phn(EcoB), arcA1655, fnr-1655 | B. Wanner, Purdue University |
| Strain, strain background (E. coli) | BW5104 | PMID: 2160940 | Mu-1 Δlac169, creB510, hsdR514 | B. Wanner, Purdue University |
| Strain, strain background (E. coli) | RJ3960 | This work | BW5104 λR mal | |
| Strain, strain background (E. coli) | RJ3388 | Laboratory collection | BL21 (DE3) endA::tet8, fis::str/spc-985 | |
| Strain, strain background (E. coli) | RJ3431 | Laboratory collection | BL21 (DE3) metC::Tn10 | |
| Recombinant DNA reagent | See supplementary file 2 | | | |
| Sequence-based reagent | See supplementary file 3 | | | |
| Peptide, recombinant protein | DNase I | Thermo Fisher, Waltham, MA | Catalog number: EN0521 | |
| Peptide, recombinant protein | Exonuclease III | NEB, Ipswich, MA | Catalog number: M0206L | |
| Peptide, recombinant protein | Proteinase K | Roche, Germany | Catalog number: 03115828001 | |
| Peptide, recombinant protein | Trypsin | Promega, Madison, WI | Catalog number: V511A | |
| Commercial assay or kit | Sequenase Quick-Denature Plasmid Sequencing Kit | Affymetrix, Santa Clara, CA | Catalog number: 70140 | |
| Commercial assay or kit | Coomassie Protein Assay Reagent | Thermo Fisher, Waltham, MA | Catalog number: 1856209 | |
| Chemical compound, drug | Dimethyl sulfate | Thermo Fisher, Waltham, MA | Catalog number: AC430831000 | |
| Chemical compound, drug | Piperidine | Sigma-Aldrich | Catalog number: 10409–4 | |
| Chemical compound, drug | Diamide | Sigma-Aldrich | Catalog number: 87751 | |
| Chemical compound, drug | AEBSF | Gold Biotechnology | Catalog number: A-540–1 | |
| Software, algorithm | ImageQuant | GE Healthcare | RRID:SCR_014246 | |
| Software, algorithm | PyMOL Molecular Graphics System | Schrodinger, LLC | RRID:SCR_000305 | https://pymol.org/2/ |
| Software, algorithm | Protein Prospector/MS-Digest | http://prospector.ucsf.edu/prospector/cgi-bin/msform.cgi?form=msdigest | RRID:SCR_014558 | |
| Software, algorithm | Phyre2 | PMID: 25950237 | RRID:SCR_010270 | www.sbg.bio.ic.ac.uk/phyre2/ |
| Software, algorithm | XDS | PMID: 20124692 | RRID:SCR_015652 | http://xds.mpimf-heidelberg.mpg.de/ |
| Software, algorithm | PHASER | PMID: 19461840 | RRID:SCR_014219 | |
| Software, algorithm | SHELX | doi.org/10.1107/S0021889804018047 | RRID:SCR_014220 | |
| Software, algorithm | Coot | PMID: 15572765 | RRID:SCR_014222 | https://www2.mrc-lmb.cam.ac.uk/personal/pemsley/coot/ |
| Software, algorithm | Phenix | PMID: 20124702 | RRID:SCR_014224 | https://www.phenix-online.org/ |
| Software, algorithm | Buster | PMID: 22505257 | RRID:SCR_015653 | https://www.globalphasing.com/buster/ |
| Software, algorithm | CCP4 | PMID: 21460441 | RRID:SCR_007255 | http://www.ccp4.ac.uk/ |
| Software, algorithm | Procheck | doi.org/10.1107/S0021889892009944 | RRID:SCR_006511 | https://www.ebi.ac.uk/thornton-srv/software/PROCHECK/ |
| Software, algorithm | Clustal Omega | https://www.ebi.ac.uk/Tools/msa/clustalo/ | RRID:SCR_001591 | |

Strains and plasmids

E. coli strain genotypes are given in Supplementary file 1. ISC1926 was amplified from S. islandicus pyrE::ISC1926 genomic DNA (gift of D. Grogan), and IS1535 was amplified from Mycobacterium tuberculosis H37Rv genomic DNA (gift of D. Eisenberg). Synthetically-derived IS607 orfA and orfB sequences (E. coli codon optimized, Genewiz, South Plainfield, NJ), along with all plasmids used in this work and details of their construction, are given in Supplementary files 2 and 3.

Transposition assays

RJ1224 (recA λbbnin cI857 b515 b519 nin5 Sam7]) containing pBR322 with an IS607-tet derivative was grown at 30°C in 2xYT and 10 µg/ml tetracycline. λ lysates were obtained upon shifting the culture to 42°C for 20 min and then to 37°C for 3 hr to allow phage development. Lysates were titered on LE392 (supF) and used to transduce early stationary phase LB cultures of the high frequency lysogenizing strain Hfl-1 (Belfort and Wulff, 1973) at a multiplicity of infection of about 0.3. After 20 min at 30°C, 2 volumes of LB were added, and incubation continued for 60 min. Cells were plated onto LB + 10 µg/ml tetracycline, and TetR (AmpS, temperature-sensitive) transductants per plaque forming unit (PFU) were scored as transposition events.

λ genomic fragments containing IS607-tet were transferred to plasmids for DNA sequencing by the in vivo mini-Mu cloning method (Groisman and Casadaban, 1986). Lysates of the Hfl-1 λbbnin::IS607-tet transductants were used to lysogenize BW14879 containing the mini-Mu cloning plasmid pMW11 (str/spcR) and Muc62 (Metcalf et al., 1990). Mini-Mu lysates were prepared by thermal induction and used to infect RJ3960, a λR derivative of BW5104 selected as a maltose non-fermenting survivor after λcI- b221 infection. After growth for 60 min at 30°C, the cells were plated on LB + 10 µg/ml tetracycline and 25 µg/ml streptomycin. Plasmid DNAs from TetR, StrR colonies were sized on agarose gels, and plasmids < 15 kb were subjected to DNA sequencing using primer oRJ878, which reads out from the left end of IS607. The sequence identified the insertion position on the λ genome, and insertion-specific λ primers flanking the transposon were then used to amplify the region from the original λ::IS607-tet lysate as a template and to sequence the right junction using primer oRJ879, which reads out from the right end of IS607-tet. All amplicon sizes were consistent with simple insertions.

Purification of TnpA and TnpA-CTD

TnpA proteins were expressed in RJ3388 in 2xYT at OD600 = 1 with 0.4 mM IPTG for ~16 hr at 15°C. Cells expressing full-length proteins were lysed in 25 mM MES-NaOH, pH 6.0, 300 mM NaCl, 5 mM β-mercaptoethanol (βME), 5 mM EDTA, and 10% glycerol by three passes through a French Press. Clarified extracts were batch incubated with SP Sepharose Fast Flow resin (GE Healthcare, Chicago, Illinois) for 2 hr at 4°C, the resin was washed extensively with lysis buffer containing 400 mM NaCl, and protein was eluted with 50 mM HEPES, pH 7.5, 1 M NaCl, 10% glycerol, and 5 mM βME. The partially purified TnpA was then bound to Ni-NTA agarose (Goldbio, St. Louis, Missouri) in the same buffer. The resin was washed with Buffer A (25 mM HEPES, pH 7.5, 1 M NaCl, 5 mM βME, and 10% glycerol) + 50 mM imidazole, and TnpA was eluted with Buffer A + 500 mM imidazole. Batch chromatography was used to avoid protein precipitation upon elution. TnpA was dialyzed into storage buffer (25 mM HEPES, pH 7.5, 1 M Na acetate, 5 mM βME, and 50% glycerol) and stored at −20°C or at −80°C after quick freezing.

RJ3388 cells expressing the TnpA-CTD were lysed by French Press in Buffer A (50 mM MOPS, pH 7.0, 1 M NaCl, 25 mM imidazole, 5 mM βME, and 10% glycerol). For Se-methionine (Se-met) labeling, RJ3431 (metC) containing pRJ3347 was grown in M9 glucose + 20 µg/ml methionine to an OD600 = 1.5. Cells were chilled and transferred to M9 glucose, incubated for 20 min at 15°C followed by addition of 60 µg/ml Se-met and 0.4 mM IPTG. Clarified extracts were incubated with Ni-NTA resin, washed in Buffer A + 50 mM imidazole, and eluted in Buffer B (50 mM MES-NaOH, pH 6.0, 5 mM βME, 10% glycerol) plus 500 mM NaCl and 250 mM imidazole. Eluted proteins were mixed with an equal volume of Buffer B and then incubated with SP Sepharose Fast Flow for 2 hr. The resin was washed with Buffer B + 400 mM NaCl, and nearly pure TnpA-CTD was eluted in Buffer B + 1 M NaCl. The protein was concentrated with an Amicon Ultra-15 centrifugal filter (3 kDa cutoff, MilliporeSigma, Burlington, MA) and applied to a Superdex 75 (16/600; GE Healthcare) column on an FPLC in Buffer B + 1 M NaCl. The peak fractions containing the CTD were pooled, exchanged into crystallization buffer, and concentrated.

Domain mapping by limited proteolysis and mass spectrometry

IS607 and IS1535 TnpA (3 µg) were incubated in 20 µl 25 mM HEPES (pH 7.5), 300 mM NaCl, 10% glycerol and 5 mM 2-mercaptoethanol at 37°C with 50 ng trypsin (Promega, Madison, WI) for varying times up to 30 min. TnpAISC1926 reactions were identical except that 10 mM CaCl2 was included in the cleavage buffer, and 100 ng of trypsin was added. Proteolysis reactions were quenched with 5 mM AEBSF (4-(2-aminoethyl)benzenesulfonyl fluoride hydrochloride; Sigma-Aldrich, St. Louis, MO), subjected to 18% SDS-PAGE in Tricine buffer with 10% glycerol in the separating gel, and stained with Coomassie Blue. Aliquots were analyzed by MALDI-TOF-MS on an Applied Biosystems Voyager DE-STR instrument operated in positive ion mode with and without the reflectron. Upon testing several matrices, sinapinic acid was found to yield the best mass spectra. Peptide molecular weights were compared to all trypsin cleavage products calculated for the protein using the MS-Digest tool in Protein Prospector (http://prospector.ucsf.edu/prospector/cgi-bin/msform.cgi?form=msdigest) to determine the most likely endpoints.
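The comparison of observed masses with calculated trypsin cleavage products can be illustrated with a minimal sketch. It is not the MS-Digest implementation: the function names, the toy sequence in the usage note, and the mass tolerance are hypothetical, and monoisotopic residue masses are used for simplicity (average masses may be more appropriate for large fragments measured without the reflectron). Trypsin is assumed to cleave after Lys or Arg unless the next residue is Pro, and every fragment bounded by two such sites or by a protein terminus is considered, as expected for a partial digest.

```python
# Monoisotopic residue masses in Da (standard values, not taken from the paper).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def cleavage_boundaries(seq):
    """Return 0, each position after K/R that is not followed by P, and len(seq)."""
    cuts = [0]
    for i, aa in enumerate(seq[:-1]):
        if aa in "KR" and seq[i + 1] != "P":
            cuts.append(i + 1)
    cuts.append(len(seq))
    return cuts


def fragment_mass(fragment):
    """Neutral monoisotopic mass of a peptide fragment."""
    return sum(RESIDUE_MASS[aa] for aa in fragment) + WATER


def candidate_fragments(seq, observed_mass, tolerance):
    """List (first residue, last residue, mass) for fragments matching the observed mass.

    Residue numbers are 1-based; fragments may span missed cleavage sites.
    """
    cuts = cleavage_boundaries(seq)
    hits = []
    for a in range(len(cuts)):
        for b in range(a + 1, len(cuts)):
            start, end = cuts[a], cuts[b]
            mass = fragment_mass(seq[start:end])
            if abs(mass - observed_mass) <= tolerance:
                hits.append((start + 1, end, round(mass, 3)))
    return hits
```

For the toy sequence `"GAKSR"`, `candidate_fragments("GAKSR", 274.16, 0.05)` reports the single fragment spanning residues 1 to 3. With a real protein sequence, several fragments may fall within the tolerance, in which case the gel mobility and the time course of the digest help to decide between them.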

Electrophoretic (gel) mobility shift assays

TnpAIS1535 binding reactions were performed in 20 µl 25 mM HEPES, pH 7.5, 150 mM Na acetate, 5 mM Mg acetate, 1 mM DTT, 500 µg/ml BSA, 5% glycerol, 25 µg/ml sonicated salmon sperm DNA (Rockland, Limerick, PA), and 1 nM 32P-labeled DNA probe. DNA probes were generated by PCR with LE or RE specific primers using pRJ3234 (IS1535 LE) or pRJ3348 (IS1535 RE) as the template (Supplementary files 2 and 3) and PAGE purified. The standard 149 bp LE probe (91 bp LE side, 58 bp host side) used in Figures 3 and 7 was generated using oRJ839 and oRJ840. A portion was end-labeled with γ-32P-ATP (Perkin Elmer, Waltham, MA) and polynucleotide kinase (NEB), and free label was removed with a G-50 Micro column (GE Healthcare). Labeled probe was added to unlabeled probe to give 1 nM total probe in the binding reaction. Freshly diluted TnpA in 25 mM HEPES, pH 7.5, 1 M Na acetate, 1 mM DTT, 500 µg/ml BSA, and 20% glycerol was added to the binding mixture and typically incubated at 37°C for 60 min before being applied to a 6% polyacrylamide gel (acrylamide:bisacrylamide 37.5:1) in 25 mM Tris-acetate, pH 7.5, and 1 mM Mg acetate (gel and running buffer). Electrophoresis was typically at 3.5 V/cm for 12 hr at 23°C. TnpAIS1535 proteins were oxidized for disulfide crosslinking by incubation at 4°C overnight in 25 mM HEPES, pH 7.5, 1 M Na acetate, 20% glycerol and 0.2 mM diamide, and the binding buffer contained 0.2 mM diamide in place of DTT. A Typhoon phosphorimager was used for image acquisition, and analysis was performed with ImageQuant (GE Healthcare).
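As a worked example of the quantities involved, the amounts of probe and competitor DNA in one 20 µl binding reaction can be estimated as follows. This is a sketch only: the function names are hypothetical, and the value of 650 g/mol per base pair is a common approximation for double-stranded DNA rather than a figure from this study.

```python
BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one dsDNA base pair


def probe_fmol(conc_nM, volume_ul):
    """Amount of probe in fmol (nM x µl = 1e-9 mol/l x 1e-6 l = 1e-15 mol)."""
    return conc_nM * volume_ul


def probe_ng(fmol, length_bp):
    """Approximate mass in ng of a double-stranded probe of the given length."""
    grams = fmol * 1e-15 * length_bp * BP_MASS_G_PER_MOL
    return grams * 1e9


def competitor_ng(conc_ug_per_ml, volume_ul):
    """Mass of competitor DNA in ng (1 µg/ml equals 1 ng/µl)."""
    return conc_ug_per_ml * volume_ul


amount = probe_fmol(1.0, 20.0)           # 1 nM probe in a 20 µl reaction
probe_mass = probe_ng(amount, 149)       # standard 149 bp LE probe
competitor = competitor_ng(25.0, 20.0)   # 25 µg/ml salmon sperm DNA
print(f"probe: {amount:.0f} fmol, about {probe_mass:.2f} ng")
print(f"competitor: {competitor:.0f} ng")
```

Under these assumptions a reaction contains 20 fmol (roughly 2 ng) of the 149 bp probe and 500 ng of salmon sperm DNA, so the nonspecific competitor is present in a mass excess of a few hundred-fold over the probe.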

Nuclease and chemical probing of TnpA complexes

Binding reactions were the same as for the EMSAs except that the labeled probe was generated by amplifying pRJ3234 (IS1535 LE), pRJ3348 (IS1535 RE) (Figure 4 and Figure 8—figure supplement 1) or pRJ3352 (Figure 6) with 5′-labeled oRJ880 or oRJ881. After 60 min incubation at 37°C with TnpAIS1535, DNase I (0.02 u, Thermo Fisher, Waltham, MA) or Exonuclease III (10 u, NEB, Ipswich, MA) was added for 30 s or 5 min, respectively. Reactions were quenched with 150 mM Tris-HCl, pH 8.5, 10 mM CDTA, 0.8% SDS, and 12.5 µg/ml proteinase K and incubated 10 min at 65°C. The DNA was ethanol-precipitated, dissolved in formamide-NaOH dye, and electrophoresed through 6% acrylamide-urea sequencing gels in TBE. Dimethyl sulfate reactions (10 mM, 30 s) under the same binding conditions and DNA cleavage with piperidine were performed essentially as described (Shaw and Stewart, 1994). Sequence ladders were generated using the Sequenase Quick-Denature Plasmid Sequencing Kit (Affymetrix, Santa Clara, CA).

Crystallization and structure determination

The best diffracting crystals of TnpAISC1926 CTD were obtained using the hanging drop method by mixing equal volumes of a 10 mg/ml protein solution in 20 mM MOPS, pH 7.0, 100 mM Na-acetate, 0.1 mM DTT with a reservoir solution containing 8% (v/v) tacsimate, pH 4.0, and 20% (w/v) PEG3350. Crystals grew at 25°C, and although additional cryoprotectants were screened, they showed no increase in diffraction relative to the drop solution alone. For TnpAIS1535 CTD, optimal crystals were grown by mixing equal volumes of a 5–9 mg/ml protein solution in 0.3 M sodium acetate, pH 5.0, and 1 mM TCEP with a reservoir solution containing 0.2 M sodium citrate + 20% (w/v) PEG3350. Crystals were cryoprotected in reservoir solution plus 30% glycerol.
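Because each drop is set up by mixing equal volumes of protein and reservoir solution, every component starts at half its stock concentration before the drop equilibrates against the reservoir. The short sketch below works through this arithmetic for the TnpAISC1926 CTD condition; the function and dictionary key names are hypothetical, and pH is left out because it does not average linearly.

```python
def mix_equal_volumes(protein_solution, reservoir_solution):
    """Initial drop concentrations after 1:1 mixing of two solutions.

    Each component is halved; a component present in both solutions
    ends up at the mean of its two stock concentrations.
    """
    drop = {}
    for solution in (protein_solution, reservoir_solution):
        for component, conc in solution.items():
            drop[component] = drop.get(component, 0.0) + conc / 2.0
    return drop


# TnpAISC1926 CTD condition as described above (units are part of the key names).
protein = {"protein_mg_per_ml": 10.0, "MOPS_mM": 20.0,
           "Na_acetate_mM": 100.0, "DTT_mM": 0.1}
reservoir = {"tacsimate_pct_v_v": 8.0, "PEG3350_pct_w_v": 20.0}

for component, conc in mix_equal_volumes(protein, reservoir).items():
    print(f"{component}: {conc:g}")
```

The drop therefore starts at about 5 mg/ml protein, 4% tacsimate and 10% PEG3350. These are initial values only, since vapor diffusion subsequently concentrates the drop toward the reservoir composition.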

All X-ray diffraction data were collected at 100 K at the Advanced Photon Source (Chicago IL) beamline 24-ID-C on a DECTRIS-PILATUS 6M detector. TnpAISC1926 CTD data were collected to 2.9 Å and integrated and scaled with XDS (Kabsch, 2010). The phases were solved by molecular replacement with PHASER (McCoy et al., 2007) using 3LHK chain D as the search model. Model building and refinement were performed using Coot (Emsley and Cowtan, 2004), PHENIX (Adams et al., 2002), and BUSTER (Smart et al., 2012). TnpAIS1535 CTD native and Se-met data were both collected to 2.5 Å resolution, and integrated and scaled using XDS. MAD phases were calculated from six selenium atoms with HKL2MAP (Pape and Schneider, 2004). Automatic model building was performed with BUCCANEER (Winn et al., 2011), which traced approximately 90% of the two chains. This model was then used to continue model building and refinement on the native dataset using Coot and BUSTER. X-ray data and refinement statistics are given in Table 1; the PDB code for the TnpAISC1926 CTD is 6DGB and for TnpAIS1535 CTD is 6DGC. Molecular graphics images of the structures were produced with PyMOL (Schrödinger, https://pymol.org/2/).

Modeling

Structure models of the N-terminal domains were generated by Phyre2 (Kelley et al., 2015). A structural model of an intact TnpAISC1926 dimer was generated from the Phyre2 model of the TnpAISC1926 NTD (residues 12–61, Figure 2—figure supplement 2C) linked to residue 65 of the CTD by the native residues Arg-Glu-Glu using Coot. The NTD was docked onto a DNA model (3DNA; Lu and Olson, 2003) of the IS1535 LE sequence with the aid of the closely related RacA-DNA complex (Figure 2—figure supplement 2C) and DMS protection data (Figure 8—figure supplement 1) using PyMOL and Coot.

References

  1. 1
  2. 2
    hAT transposable elements
    1. PW Atkinson
    (2015)
    Microbiology Spectrum, 3, 10.1128/microbiolspec.MDNA3-0054-2014, 26350319.
  3. 3
  4. 4
  5. 5
  6. 6
  7. 7
    Genetic and biochemical investigation of the Escherichia coli mutant hfl-1 which is lysogenized at high frequency by bacteriophage lambda
    1. M Belfort
    2. DL Wulff
    (1973)
    Journal of Bacteriology 115:299–306.
  8. 8
  9. 9
  10. 10
  11. 11
  12. 12
  13. 13
  14. 14
  15. 15
  16. 16
    Coot: model-building tools for molecular graphics
    1. P Emsley
    2. K Cowtan
    (2004)
    Acta Crystallographica Section D Biological Crystallography 60:2126–2132.
    https://doi.org/10.1107/S0907444904019158
  17. 17
  18. 18
  19. 19
  20. 20
  21. 21
  22. 22
    Transposable phage Mu
    1. RM Harshey
    (2014)
    Microbiology Spectrum, 2, 10.1128/microbiolspec.MDNA3-0007-2014, 26104374.
  23. 23
    The IS200/IS605 Family and "Peel and Paste" Single-strand Transposition Mechanism
    1. S He
    2. A Corneloup
    3. C Guynet
    4. L Lavatine
    5. A Caumont-Sarcos
    6. P Siguier
    7. B Marty
    8. F Dyda
    9. M Chandler
    10. B Ton Hoang
    (2015)
    Microbiology Spectrum, 3, 10.1128/microbiolspec.MDNA3-0039-2014, 26350330.
  24. 24
  25. 25
  26. 26
  27. 27
  28. 28
    Initial sequencing and analysis of the human genome
    1. International Human Genome Sequencing Consortium
    2. ES Lander
    3. LM Linton
    4. B Birren
    5. C Nusbaum
    6. MC Zody
    7. J Baldwin
    8. K Devon
    9. K Dewar
    10. M Doyle
    11. W FitzHugh
    12. R Funke
    13. D Gage
    14. K Harris
    15. A Heaford
    16. J Howland
    17. L Kann
    18. J Lehoczky
    19. R LeVine
    20. P McEwan
    21. K McKernan
    22. J Meldrim
    23. JP Mesirov
    24. C Miranda
    25. W Morris
    26. J Naylor
    27. C Raymond
    28. M Rosetti
    29. R Santos
    30. A Sheridan
    31. C Sougnez
    32. Y Stange-Thomann
    33. N Stojanovic
    34. A Subramanian
    35. D Wyman
    36. J Rogers
    37. J Sulston
    38. R Ainscough
    39. S Beck
    40. D Bentley
    41. J Burton
    42. C Clee
    43. N Carter
    44. A Coulson
    45. R Deadman
    46. P Deloukas
    47. A Dunham
    48. I Dunham
    49. R Durbin
    50. L French
    51. D Grafham
    52. S Gregory
    53. T Hubbard
    54. S Humphray
    55. A Hunt
    56. M Jones
    57. C Lloyd
    58. A McMurray
    59. L Matthews
    60. S Mercer
    61. S Milne
    62. JC Mullikin
    63. A Mungall
    64. R Plumb
    65. M Ross
    66. R Shownkeen
    67. S Sims
    68. RH Waterston
    69. RK Wilson
    70. LW Hillier
    71. JD McPherson
    72. MA Marra
    73. ER Mardis
    74. LA Fulton
    75. AT Chinwalla
    76. KH Pepin
    77. WR Gish
    78. SL Chissoe
    79. MC Wendl
    80. KD Delehaunty
    81. TL Miner
    82. A Delehaunty
    83. JB Kramer
    84. LL Cook
    85. RS Fulton
    86. DL Johnson
    87. PJ Minx
    88. SW Clifton
    89. T Hawkins
    90. E Branscomb
    91. P Predki
    92. P Richardson
    93. S Wenning
    94. T Slezak
    95. N Doggett
    96. JF Cheng
    97. A Olsen
    98. S Lucas
    99. C Elkin
    100. E Uberbacher
    101. M Frazier
    102. RA Gibbs
    103. DM Muzny
    104. SE Scherer
    105. JB Bouck
    106. EJ Sodergren
    107. KC Worley
    108. CM Rives
    109. JH Gorrell
    110. ML Metzker
    111. SL Naylor
    112. RS Kucherlapati
    113. DL Nelson
    114. GM Weinstock
    115. Y Sakaki
    116. A Fujiyama
    117. M Hattori
    118. T Yada
    119. A Toyoda
    120. T Itoh
    121. C Kawagoe
    122. H Watanabe
    123. Y Totoki
    124. T Taylor
    125. J Weissenbach
    126. R Heilig
    127. W Saurin
    128. F Artiguenave
    129. P Brottier
    130. T Bruls
    131. E Pelletier
    132. C Robert
    133. P Wincker
    134. DR Smith
    135. L Doucette-Stamm
    136. M Rubenfield
    137. K Weinstock
    138. HM Lee
    139. J Dubois
    140. A Rosenthal
    141. M Platzer
    142. G Nyakatura
    143. S Taudien
    144. A Rump
    145. H Yang
    146. J Yu
    147. J Wang
    148. G Huang
    149. J Gu
    150. L Hood
    151. L Rowen
    152. A Madan
    153. S Qin
    154. RW Davis
    155. NA Federspiel
    156. AP Abola
    157. MJ Proctor
    158. RM Myers
    159. J Schmutz
    160. M Dickson
    161. J Grimwood
    162. DR Cox
    163. MV Olson
    164. R Kaul
    165. C Raymond
    166. N Shimizu
    167. K Kawasaki
    168. S Minoshima
    169. GA Evans
    170. M Athanasiou
    171. R Schultz
    172. BA Roe
    173. F Chen
    174. H Pan
    175. J Ramser
    176. H Lehrach
    177. R Reinhardt
    178. WR McCombie
    179. M de la Bastide
    180. N Dedhia
    181. H Blöcker
    182. K Hornischer
    183. G Nordsiek
    184. R Agarwala
    185. L Aravind
    186. JA Bailey
    187. A Bateman
    188. S Batzoglou
    189. E Birney
    190. P Bork
    191. DG Brown
    192. CB Burge
    193. L Cerutti
    194. HC Chen
    195. D Church
    196. M Clamp
    197. RR Copley
    198. T Doerks
    199. SR Eddy
    200. EE Eichler
    201. TS Furey
    202. J Galagan
    203. JG Gilbert
    204. C Harmon
    205. Y Hayashizaki
    206. D Haussler
    207. H Hermjakob
    208. K Hokamp
    209. W Jang
    210. LS Johnson
    211. TA Jones
    212. S Kasif
    213. A Kaspryzk
    214. S Kennedy
    215. WJ Kent
    216. P Kitts
    217. EV Koonin
    218. I Korf
    219. D Kulp
    220. D Lancet
    221. TM Lowe
    222. A McLysaght
    223. T Mikkelsen
    224. JV Moran
    225. N Mulder
    226. VJ Pollara
    227. CP Ponting
    228. G Schuler
    229. J Schultz
    230. G Slater
    231. AF Smit
    232. E Stupka
    233. J Szustakowki
    234. D Thierry-Mieg
    235. J Thierry-Mieg
    236. L Wagner
    237. J Wallis
    238. R Wheeler
    239. A Williams
    240. YI Wolf
    241. KH Wolfe
    242. SP Yang
    243. RF Yeh
    244. F Collins
    245. MS Guyer
    246. J Peterson
    247. A Felsenfeld
    248. KA Wetterstrand
    249. A Patrinos
    250. MJ Morgan
    251. P de Jong
    252. JJ Catanese
    253. K Osoegawa
    254. H Shizuya
    255. S Choi
    256. YJ Chen
    257. J Szustakowki
    (2001)
    Nature 409:860–921.
    https://doi.org/10.1038/35057062
  29. 29
  30. 30
  31. 31
  32. 32
    XDS
    1. W Kabsch
    (2010)
    Acta Crystallographica. Section D, Biological Crystallography 66:125–132.
    https://doi.org/10.1107/S0907444909047337
  33. 33
  34. 34
  35. 35
  36. 36
  37. 37
  38. 38
  39. 39
  40. 40
  41. 41
  42. 42
  43. 43
  44. 44
  45. 45
  46. 46
  47. 47
  48. 48
  49. 49
    Identification of Protein-DNA Contacts with Dimethyl Sulfate
    1. PE Shaw
    2. AF Stewart
    (1994)
    In: GG Kneale, editor. DNA-Protein Interactions: Principles and Protocols. New York: Springer. pp. 79–87.
    https://doi.org/10.1385/0-89603-256-6:79
  50. 50
  51. 51
  52. 52
    Phage-encoded serine integrases and other large serine recombinases
    1. MCM Smith
    (2015)
    Microbiology Spectrum, 3, 10.1128/microbiolspec.MDNA3-0059-2014, 26350324.
  53. 53
  54. 54
    The serine recombinases
    1. WM Stark
    (2014)
    Microbiology Spectrum, 2, 10.1128/microbiolspec.MDNA3-0046-2014, 26104451.
  55. 55
  56. 56
  57. 57
  58. 58
  59. 59
  60. 60
  61. 61
  62. 62
  63. 63

Decision letter

  1. Stephen C Kowalczykowski
    Reviewing Editor; University of California, Davis, United States
  2. Gisela Storz
    Senior and Reviewing Editor; National Institute of Child Health and Human Development, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Multiple serine transposase dimers assemble the transposon-end synaptic complex during IS607-family transposition" for consideration by eLife. Your article has been reviewed by Gisela Storz as the Senior Editor, a Reviewing Editor, and three reviewers. The reviewers have opted to remain anonymous.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

This is an important and well-executed paper on IS607 family transposition. Both the experimental work and its description are well-done, clear, and of a high standard. The manuscript represents an important contribution to the field.

Overall, the quality and thoroughness of the science in this paper is first rate. Some revision based on the comments of the reviewers will strengthen the manuscript in how the results are interpreted and how the models are to be favored or disfavored. The work sets the stage for further testing of models and examining mechanisms for IS607 family transposition. Although the list of comments is extensive, the authors should note that none of the reviewers have insisted on any further experiments by the authors; if they can supply any more experimental information that they may have on the points made by the reviewers (e.g., the activity of LE-LE complexes), then addition of such data would improve the paper, but is not essential.

Otherwise, despite the longer than normal list of comments below, the authors should be aware that most of the questions arose from an interest in their work, rather than from skepticism of their conclusions. In the end, all of the changes required are simply additions or clarifications of the text (e.g., the Discussion section). Consequently, it should be relatively easy for the authors to modify the text to accommodate these suggestions and improve the manuscript within two months.

[Please note that it was difficult to summarize the reviewers' comments without including most of the full review to provide the proper context of their comments. Consequently, what follows is an atypical eLife review of a manuscript that is in principle acceptable for publication and, instead of an editorial summary, contains the verbatim comments of the reviewers. The discussion amongst the reviewers is a summary of the major points that should be clarified in the revised manuscript, although the authors should consider the full scope of the reviewers’ comments.]

Summary of reviewers' discussion and the specific comments to be addressed (please see the full set of comments by the reviewers for context):

1) Both reviewers 1 and 3 were in complete accord with principal comments 1-3 as expressed by reviewer 1 (see related comments by reviewer 3).

2) Both reviewers 1 and 3 agree that the foot-printing data is extensive and thorough, and characterizes the interactions of TnpA dimers with individual binding elements within each transposon end, and the allosteric/cooperative contributions to binding. One concern is whether this binding characterization is sufficient to draw strong conclusions regarding paired end complex (PEC) formation (or synapsis of the ends), which is the principal question that the authors set out to solve.

3) Both reviewers 1 and 3 found the model for cooperative TnpA assembly at the left end followed by capture of the unbound right end somewhat unsubstantiated by the experimental data. Of course, this is a model at this stage. Both reviewers 1 and 3 were not sure that the present results rule out PEC formation by TnpA-bound left and right ends, with the pairing stabilizing the TnpA dimers bound to the right end. This should be addressed or the conclusions qualified.

4) Both reviewers 1 and 3 thought that the authors may want to restate their position on why they consider rotation unlikely rather than rule out this possibility. Although they agree with the authors that subunit rotation is unlikely, they nonetheless agree that subunit rotation cannot be dismissed as a mechanism based on the data presented. Perhaps the authors could expand this discussion a bit by explaining their reasoning and noting that their data do not rule it out. In this context, the authors should consider: (1) The dimerization interface does not involve helix-E and therefore subunit rotation would have to occur via a different interface compared to superfamily members with C-terminal DBDs. (2) I don't see why anything needs to rotate when the transposon is excised from the host.

5) Both reviewers 2 and 3 agree with the comments on Figure 8 by reviewer 2 regarding models based on affinity differences in binding.

6) Both reviewers 2 and 3 thought that the role (or lack thereof) for helix E in PEC formation is worth consideration from the structure perspective.

7) The question of the functional competence of identical sequences at left and right ends of the transposon placed in the proper orientation is a relevant question that all three reviewers raise.

8) Finally, an additional topic that the authors may want to briefly discuss at the end is how their model might accommodate binding to target DNA. Presumably, another catalytic dimer equivalent needs to be provided. This would be speculative, but would relate the results of the paper back to the overall transposition pathway for readers.

Reviewer #1:

Chen et al. report characterization of the transposase-DNA interactions involved in transposition by members of the IS607 family of transposons, along with two new structures of transposase catalytic domains.

This is an excellent manuscript which provides substantial and much-needed information about how IS607-family serine transposases interact with their transposon ends to promote strand transfer. The work is presented very clearly and the manuscript is very well written throughout. Also, some difficult experimental work has been carried out to a high standard – in particular, the footprinting data are very nice. I think that the manuscript will be an important contribution to the field that will be very useful to others who wish to investigate these unusual transposition systems further. I have some comments on specific issues which I hope the authors will address (see below), but these are all of a relatively minor nature.

1) Much of the biochemical analysis is done on an IS1535 paired-ends complex (PEC) comprising two left ends (LE) bound together by the transposase. However, one would expect that for natural transposition, the active complex would be LE-RE, which is apparently less stable in the authors' assays. Is there any evidence that the LE-LE PEC is an active intermediate? For example, can transposition of an artificial element with two LEs be observed?

2) Results section and Discussion section. The authors propose that a single-end complex captures an unbound end to generate a PEC. Could the authors make their justification of this hypothesis clearer? Can they exclude both ends being bound by transposase? If their proposal is correct, formation of the PEC should be inhibited by higher concentrations of TnpA.

3) Last paragraph of Discussion section. The authors propose that excision and integration by this family of transposases might not involve a 'subunit rotation' mechanism as is proposed for related serine site-specific recombinases – which may be true, but I don't think that any of the data presented here argue strongly against a rotation mechanism.

Reviewer #2:

This is an interesting paper that represents an advance in understanding how IS607 family transposition occurs. The experimental support for formation of a TnpA nucleoprotein filament on LE DNA and a much smaller complex on RE DNA are strong. The data showing highly cooperative PEC formation between two LEs is also convincing.

The models in Figure 8 suggest that the IS607 serine recombinases acting at 'accessory sites' do so in a manner that is quite different from what the catalytic dimers must be doing. This isn't the case for the other small SR dimers, which all bind DNA in roughly similar ways, regardless of whether or not they are catalytic. Since the catalytic dimers of TnpA likely function as well-positioned, but non-specific nucleases in the excision reaction, perhaps this explains why the active sites are kept away from DNA on the filament (according to the model proposed).

One implication of the Figure 8 models is that 8A must not be much higher affinity vs 8B. If this weren't the case, I don't see how a LE complex could ever form; the intramolecular wHTH-DNA interactions would dominate. It seems like this should be a testable prediction.

I was surprised that the E-helices are not required for PEC formation; this was an important result that limits the types of interactions likely to occur on the filament vs between LE and RE.

With a strong focus on complexes formed on and between LE DNA, I was surprised that the activity of LE alone was not mentioned. Is a LE-TnpA-tet-LE(inv) construct active or toxic in vivo? Is there any sign of cleavage or covalent intermediate in the PECs formed at high TnpA?

Reviewer #3:

This paper by Chen et al. is a detailed structural and biochemical analysis of the DNA-protein interactions involved in the assembly of the high-order structures that mediate IS607 family transposition. This family of transposons deviates from the classical definition of transposons in that they do not harbour the canonical 'inverted repeat sequences' at their termini, and do not generate target site duplications at their insertion points. Transposition requires a single protein (the transposase), encoded by the transposon, that forms functional oligomers on cognate DNA elements. The catalytic pocket of the transposase is typical of well characterized serine site-specific recombinases, which carry out the strand cleavage and strand joining steps of recombination by transesterification chemistry. By contrast, the classical transposases perform strand cleavage by hydrolysis, and strand transfer (joining) by transesterification. The consensus dinucleotide (GG) at the insertion site in the recipient DNA (also the left and right junctions of the insertion) is consistent with the typical serine recombination mechanism in which the exchanged region is 2 bp long.

The principal conclusion from this study is that oligomeric assembly of the transposase at one end of the transposon containing multiple binding elements captures the other end containing fewer such elements to form a paired end complex (PEC), or a 'transpososome'. PEC formation does not require a change in the dimeric structure of the C-terminal catalytic domain, even though some remodeling of the E-helix within this domain seems to be necessary. The chemical activity of the transposase would involve some significant movement within the observed dimer structure (without DNA) in which the catalytic serine residues are too far apart to attack their target phosphodiester bonds.

This is a well written paper with an extensive set of data presented clearly in figures and text, and interpreted carefully. The structural view of the transposition complex emerging from the present studies will certainly provide the foundation for further testing mechanistic models for the chemical steps of the reaction carried out by this complex. The following comments may be considered if the authors wish to view their system from the broader perspective of phosphoryl (nucleotidyl) transfer by transesterification chemistry rather than from the narrower perspective of transposition.

1) The lack of target DNA duplication in the IS607 family is interesting from a purely transposition point of view, and based on precedent. The length of the duplication reflects the spacing of the nucleophiles (3'-hydroxyls in canonical transposition) and in turn the staggered insertion of the transposon ends at the target site. Depending on the extent of replication/repair, the nucleotides spanning the gap generated by the insertion, or these nucleotides plus the entire transposon, can be duplicated to give either a simple insertion or a co-integrate. Since the length of the duplication varies among different families, a few bp to as many as ~30 bp for CRISPRs, why not think of the IS607 family as a '0 bp' duplication family?

2) In some sense, the differences between conservative site-specific recombination, DNA transposition and even homologous recombination are often less marked than they are purported to be, if considered from a purely mechanistic standpoint. Given the rather limited chemical repertoire available for biological catalysis, how many different ways can one break or form a phosphodiester bond in DNA or RNA? Rather few. Topoisomerases, conservative site-specific recombinases, enzymes that carry out homologous recombination and DNA/RNA polymerases illustrate the limited number of ways that this chemistry can be performed under different biological contexts. Juggling these mechanisms to reach similar genetic outcomes, or using the same mechanism to bring about distinct genetic rearrangements, is expected to be the rule rather than the exception. The authors may wish to tone down their characterization of IS607 family as an outlier among transposons.

3) Given that the transposase uses a serine nucleophile-based mechanism, the strand cleavage and the strand transfer steps of transposition must follow transesterification chemistry. Hence the reaction is more akin to site-specific recombination. One could think of the insertion of the transposon as recombination-mediated cassette exchange (RMCE), although a non-standard one. The donor cassette (transposon) is flanked by two recombination sites as in normal RMCE, but the target site has only one equivalent site, so the excised cassette in exchange for transposon insertion is a '0 bp cassette' (on par with 0 bp target duplication during transposition).

I believe that the 'cut-and-paste' mechanism that the authors propose in the 'Discussion' is more or less the same as the non-reciprocal RMCE. An implication of this mechanism is that the donor DNA is likely to be repaired efficiently without addition or loss of DNA. Is this true from in vivo assays? It is probably known and mentioned here, and I might have somehow missed it.

Specific Comments:

1) Summary: Is the last sentence 'We posit … PEC recruits a chemically-active conformer of TnpA …' justified by the data? Is it not possible that the TnpA dimer (tetramer?) adjacent to the scissile phosphates acquires cleavage competence within the assembled PEC? The sentence, as it stands now, suggests that there are inactive and active TnpA dimers in solution and the assembled PEC selectively enlists active dimers to the cleavage site.

2) Figure 2 and Figure 3 (figure supplements) and text on structural data. The domain characterization of TnpA, the structural determination (C-terminal domain) and the relevant comparisons to serine recombinase catalytic domains are nicely organized and quite helpful. The wide spacing between the catalytic serine residues within the dimer unbound to DNA is not surprising, given prior examples.

In the Discussion section, there is a suggestion that the cleavage might occur in trans. Does this refer to cleavage within the transposon ends or cleavage of a transposon end and the target? Will the serine arrangement in the dimer seen in the structure be consistent with the latter mode of cleavage?

3) Figure 4 to Figure 8 (and figure supplements) and corresponding text. This section represents the heart of the paper, with extensive characterization of TnpA binding to the repeat elements of the left and right ends by EMSA, DNase I and exonuclease footprinting etc. The conclusion from the cumulative assays is that a TnpA filament nucleates at the left end and then captures the right end to form the transposition synapse (paired end complex; PEC).

Figure 3 and supplements show the ability of the left end to form a PEC, as well as the inability of the right end to do so. The assay uses two separate DNA fragments, one as the radio-labeled probe for binding, the second as an unlabeled fragment for capture into the PEC. If two right ends are present on the same DNA fragment, do the ends come together in a PEC?

Are all the binding elements a-d present at the left end required for PEC formation with right end? If the left end is replaced by a copy of the right end, will transposition occur?

Will additional binding elements at the right end be inhibitory to PEC formation and transposition? Here, one would expect filament formation at both ends.

4) Continued from 3. According to the PEC model, the ends are bridged by a series of TnpA dimers initially lined up at the left end. Do the present results exclude the possibility that the PEC formation is mediated via dimers of TnpA dimers, bound at the left and the right ends?

Is it possible to think of the extra binding elements abutting those next to the scissile phosphates as accessory sites, by analogy to Tn3 resolvase? In such a scenario, the TnpA dimers bound to the accessory sites may allosterically activate the TnpA dimers that perform DNA cleavage/transfer. The positioning of the TnpA dimers within one of the two ends cannot possibly form a high-order topological arrangement. However, a functionally relevant inter-wrapping of DNA between the bound ends would seem plausible, and is rather fleetingly alluded to by the authors.

5) Target capture? While the authors have presented detailed arguments to highlight the importance of the PEC in initiating the chemistry of transposition, the potential means of target capture and the mechanism of cleavage at the insertion site is rather unclear. In the PEC-target complex, is there a TnpA dimer at the target site, or is target cleavage accomplished by TnpA associated with the PEC? Obviously, the experiments presented here do not address this question directly. However, it would be helpful to think about target capture/cleavage in light of existing transposition models.

Perhaps an additional figure expanding on the mechanistic implications of the binding model shown in Figure 8 in cleavage and strand transfer may be appropriate (even if speculative). For example, are the cut-out of the transposon, insertion into the target, and healing of the donor concerted events? Or, does transposon excision from the donor precede its integration into the target?

The authors seem to rule out DNA rotation following cleavage as part of the strand transfer event. Is this inference consistent with the interface of the TnpA dimer in the structure? Can one ignore the possibility of rotation between cleaved transposon ends and target ends?

6) In summary, this is an important study that reveals a mode of transpososome assembly in the IS607 family that differs from the more conventional mechanisms of assembly that we are currently aware of. Many of the implications of the model proposed will be tested by in vitro reactions, which the authors are eminently capable of successfully performing.

https://doi.org/10.7554/eLife.39611.031

Author response

Reviewer 1's comments 1-3 are addressed below in the context of the Editor's summary comments 1, 3, and 4.

1) Both reviewers 1 and 3 were in complete accord with principal comments 1-3 as expressed by reviewer 1 (see related comments by reviewer 3).

3) Both reviewers 1 and 3 found the model for cooperative TnpA assembly at the left end followed by capture of the unbound right end somewhat unsubstantiated by the experimental data. Of course, this is a model at this stage. Both reviewers 1 and 3 were not sure that the present results rule out PEC formation by TnpA-bound left and right ends, with the pairing stabilizing the TnpA dimers bound to the right end. This should be addressed or the conclusions qualified.

4) Both reviewers 1 and 3 thought that the authors may want to restate their position on why they consider rotation unlikely rather than rule out this possibility. Although they agree with the authors that subunit rotation is unlikely, they nonetheless agree that subunit rotation cannot be dismissed as a mechanism based on the data presented. Perhaps the authors could expand this discussion a bit by explaining their reasoning and noting that their data do not rule it out. In this context, the authors should consider: (1) The dimerization interface does not involve helix-E and therefore subunit rotation would have to occur via a different interface compared to superfamily members with C-terminal DBDs. (2) I don't see why anything needs to rotate when the transposon is excised from the host.

1) The most important comment, reiterated by the editor, concerns the chemical activity of the IS1535 TnpA complexes. This is an area of considerable ongoing effort for us. We have tried quite hard to obtain evidence of transposition of various M. tuberculosis IS1535 transposon derivatives in E. coli, including LE-tnpA-tet-RE and LE-tnpA-tet-LE constructs. Unfortunately, sequence analyses of the rare candidates have thus far failed to confirm any true transposition events. Also, despite considerable effort, we have not yet obtained evidence for chemical activity in vitro (comment 4 of reviewer 2). There could be a number of reasons for this. As elaborated in the Discussion, our current thinking is that catalytically-active TnpA in an alternate folded state must be recruited onto the PEC at the transposon-host junction. This step may require co-translational or chaperone-assisted folding onto the PEC. This would have to occur in the heterologous E. coli cells in our in vivo assays. It is also possible that a host factor required for IS1535 transposition is missing in E. coli or that IS1535 tnpA has become catalytically inactive over time, even though it retains the normal constellation of residues around the active site that are believed to be important for SR chemistry. For example, IS1535 TnpA may not be able to fold into the chemically-active conformation.

2) Are the footprinting assays really probing the PEC? Comparisons of the PEC assembly reactions with increasing TnpA by EMSAs (Figure 3) and by DNase I protections or Exo III stops in the footprinting reactions (Figure 4) show a close correspondence. Nevertheless, we directly assayed complexes formed in the footprint reactions by EMSA (removed an aliquot for EMSA from the footprint binding reactions immediately before nuclease digestion) and provided an example in Figure 4—figure supplement 1 in the original version. Note that at the amount of TnpA where the DNA is near fully protected from DNase digestion over the core motifs (Figure 4A), almost all of the DNA is within a PEC by EMSA (Figure 4—figure supplement 1). We could provide a similar EMSA example for a footprint reaction on LE: v54-20v (Figure 6), but believe this would be redundant (a standard EMSA for this mutant is presented in Figure 5—figure supplement 1C). It is possible that footprint signals at low TnpA concentrations, particularly in the more sensitive Exo III assays, may arise from some single-end coated complexes that may not be stable to electrophoresis in the EMSAs.

3) The reviewers (reviewer 1 principal comment 2) would like us to better justify why we propose that a single-end complex captures an unbound end to generate a PEC. Specifically, reviewer 1 asks whether high TnpA over end DNA concentrations inhibit PEC formation, which would be a direct prediction of the model. This prediction is borne out in Figure 3A, and we noted this on p. 8: "Appearance of complex 2 is accompanied by a [similar] decrease of PECs" and have added a sentence emphasizing this point in the Discussion section. To better illustrate this point we have substituted Figure 3B with a plot that quantifies the complete TnpA titration range, showing a decrease in the number of PECs at high TnpA concentrations with a corresponding increase in the amounts of complex 2 (proposed to be single-end bound complexes). An expanded view of the lower end of the titration range that highlights formation of the PEC, and that was given in the original Figure 3B, is now an inset within the Figure 3B panel.

We note in the Results section, and elaborate in the Discussion section, that there is no evidence for a kinetic intermediate in the formation of the PEC, although one could argue that two bound ends could rapidly associate into a PEC and preclude detection at low TnpA concentrations. We believe the structural arguments for a single-end complex capturing an unbound end, elaborated in the Discussion section and illustrated in Figure 8, provide additional support. These considerations, together with points discussed in our response to principal comments 5 and 6, lead us to believe that our proposed PEC assembly model whereby one TnpA-bound end captures a free end is the most parsimonious explanation for the current information, but a more structurally complex model of synapsis by collision of two bound sites cannot be ruled out.

2) Both reviewers 1 and 3 agree that the foot-printing data is extensive and thorough, and characterizes the interactions of TnpA dimers with individual binding elements within each transposon end, and the allosteric/cooperative contributions to binding. One concern is whether this binding characterization is sufficient to draw strong conclusions regarding paired end complex (PEC) formation (or synapsis of the ends), which is the principal question that the authors set out to solve.

The authors may want to restate their position on why they consider rotation unlikely rather than rule out this possibility. We emphasize that we do not "rule out" the possibility of subunit rotation. We write: "We suggest that the element may be excised from the donor site and then insert into a target site using serine chemistry without strand transfer coupled to subunit rotation." As the reviewers would no doubt agree, any discussion on the nature of the target capture and strand transfer steps is speculation at this point. We currently have no data regarding the structural nature of the chemically-active protein, the target capture step, and the chemistry of transposon end cleavage and strand transfer reactions. In the absence of any data-driven basis for discussion, we question the value of describing multiple hypothetical scenarios for these steps. Nevertheless, we have re-written the last paragraph to slightly expand the discussion on target capture and implications for strand exchange by a subunit rotation reaction.

5) Both reviewers 2 and 3 agree with the comments by reviewer 2 on the Figure 8 models, which are based on affinity differences in binding.

The Figure 8A complex must not be much higher affinity than the Figure 8B complex. We agree and emphasize throughout that complex 1 is only observed at high TnpA concentrations. We imagine that a TnpA dimer dynamically associates with DNA in solution in the configuration represented by Figure 8B, but this complex is not stable to gel electrophoresis. However, in the context of multiple helically-phased motifs, a single dimer as in Figure 8B recruits other like dimers to form the filament represented by Figure 8C, which, in the absence of free LE DNA, is complex 2 that is at least somewhat stable to gel electrophoresis. Complex 1, which we are proposing to be represented by Figure 8A, appears on gels at much higher TnpA concentrations than required for PEC formation, even though this dimer conformation readily docks onto DNA by computer modeling and might be predicted to be relatively stable. Complex 1 does not appear to effectively compete with PEC formation on LEs by EMSAs or by Exo III (solution) assays. We believe that there are important kinetic features at play in the assembly of the PEC: the slow step may be the initial nucleation of the array of dimers on a single end. Once the array with its constellation of unsatisfied binding sites is formed, it rapidly associates with a naked end to form a stable PEC. We have added this to the Discussion section.

6) Both reviewers 2 and 3 thought that the role (or lack thereof) for helix E in PEC formation is worth consideration from the structural perspective.

Consider the role (or lack thereof) for helix E in PEC formation. We believe the Reviewers are thinking about a potential model by which synapsis of two pre-bound ends occurs by an association of (remodeled) helix E regions, which could be considered unlikely because of weak PEC formation by the helix E deletion mutant. We agree and have added this point at the end of the Results section and in the Discussion section. The experimental evidence disfavoring this possible mechanism for PEC assembly is in line with our favored model of a single bound end capturing a free end.

7) The question of the functional competence of identical sequences at left and right ends of the transposon placed in the proper orientation is a relevant question that all three reviewers raise.

What is the functional competence of identical sequences at left and right ends of the transposon? Although we have discussed making IS607 derivatives with two left or right ends to test the functional consequence of two identical ends on transposition in vivo, it has not been done. This is a good point, and we will now make testing this a priority. We note, however, that most IS607-family elements have multiple motifs variably distributed over relatively large segments at each end (e.g., Figure 1B) and that it is possible that the poor right end in IS1535 may be an anomaly, as hinted at in the Discussion section.

8) Finally, an additional topic that the authors may want to briefly discuss at the end is how their model might accommodate binding to target DNA. Presumably, another catalytic dimer equivalent needs to be provided. This would be speculative, but would relate the results of the paper back to the overall transposition pathway for readers.

Target DNA binding and strand transfer into target DNA. As discussed above in point 4, how target capture and strand transfer occur are important issues that, unfortunately, we currently have no information on. We have incorporated a brief discussion of these issues along with subunit rotation in the last paragraph of the Discussion section. The focus of this paper is, of course, on the earlier step of paired-end complex formation.

Reviewer #3:

1) The lack of target DNA duplication in the IS607 family is interesting from a purely transposition point of view, and based on precedent. The length of the duplication reflects the spacing of the nucleophiles (3'-hydroxyls in canonical transposition) and in turn the staggered insertion of the transposon ends at the target site. Depending on the extent of replication/repair, the nucleotides spanning the gap generated by the insertion, or these nucleotides plus the entire transposon, can be duplicated to give either a simple insertion or a co-integrate. Since the length of the duplication varies among different families, a few bp to as many as ~30 bp for CRISPRs, why not think of the IS607 family as a '0 bp' duplication family?

Why not think of the IS607 family as a '0 bp' duplication family? Although the functional consequence is indeed a "0 bp duplication," the serine chemistry mechanism that presumably generates double strand cuts and joints using two active sites (dimer) on each duplex DNA is different from conventional mechanisms. We wonder whether it is useful to lump the different catalytic strategies/families together.

2) In some sense, the differences between conservative site-specific recombination, DNA transposition and even homologous recombination are often less marked than they are purported to be, if considered from a purely mechanistic standpoint. Given the rather limited chemical repertoire available for biological catalysis, how many different ways can one break or form a phosphodiester bond in DNA or RNA? Rather few. Topoisomerases, conservative site-specific recombinases, enzymes that carry out homologous recombination and DNA/RNA polymerases illustrate the limited number of ways that this chemistry can be performed under different biological contexts. Juggling these mechanisms to reach similar genetic outcomes, or using the same mechanism to bring about distinct genetic rearrangements, is expected to be the rule rather than the exception. The authors may wish to tone down their characterization of IS607 family as an outlier among transposons.

We agree that only a handful of catalytic mechanisms exist that become tailored to generate specific types of DNA rearrangements through their unique synaptic complex architecture – indeed, the SRs are a prime example. On the other hand, we believe it is useful to highlight the unusual features of IS607-family elements in comparison to other transposable elements.

3) Given that the transposase uses a serine nucleophile-based mechanism, the strand cleavage and the strand transfer steps of transposition must follow transesterification chemistry. Hence the reaction is more akin to site-specific recombination. One could think of the insertion of the transposon as recombination-mediated cassette exchange (RMCE), although a non-standard one. The donor cassette (transposon) is flanked by two recombination sites as in normal RMCE, but the target site has only one equivalent site, so the excised cassette in exchange for transposon insertion is a '0 bp cassette' (on par with 0 bp target duplication during transposition).

I believe that the 'cut-and-paste' mechanism that the authors propose in the 'Discussion' is more or less the same as the non-reciprocal RMCE. An implication of this mechanism is that the donor DNA is likely to be repaired efficiently without addition or loss of DNA. Is this true from in vivo assays? It is probably known and mentioned here, and I might have somehow missed it.

We agree that recombination-mediated cassette exchange (RMCE) has features related to our current view of the IS607-family transposition pathway. All of the IS607 transposition products that we analyzed and that were reported by Kersulyte et al. are precise. On the other hand, there appear to be occasional interesting issues with joints among Sulfolobus ISC1926-like elements which may inform on the target insertion step. This needs more investigation and is beyond the scope of the present report on PEC formation.

Specific Comments:

1) Summary: Is the last sentence 'We posit ... PEC recruits a chemically-active conformer of TnpA ...' justified by the data? Is it not possible that the TnpA dimer (tetramer?) adjacent to the scissile phosphates acquires cleavage competence within the assembled PEC? The sentence, as it stands now, suggests that there are inactive and active TnpA dimers in solution and the assembled PEC selectively enlists active dimers to the cleavage site.

Our currently favored model as "posited" in the summary is that an assembled PEC recruits a chemically-active conformer of TnpA to be positioned over the cleavage site, but we have no hard data for this. As the reviewer suggests in comment 4, an alternative model that is also plausible is that the PEC allosterically activates TnpA proteins that may be weakly bound over the transposon-host junction. We have added this possibility in the relevant part of the Discussion section. Note that the footprint assays on the LE indicate TnpA binds over the junction very weakly: by DNase I protection binding is not always evident, and binding has not been detected by Exo III.

2) Figure 2 and Figure 3 (figure supplements) and text on structural data. The domain characterization of TnpA, the structural determination (C-terminal domain) and the relevant comparisons to serine recombinase catalytic domains are nicely organized and quite helpful. The wide spacing between the catalytic serine residues within the dimer unbound to DNA is not surprising, given prior examples.

In the Discussion section, there is a suggestion that the cleavage might occur in trans. Does this refer to cleavage within the transposon ends or cleavage of a transposon end and the target? Will the serine arrangement in the dimer seen in the structure be consistent with the latter mode of cleavage?

The suggestion in the Discussion section (noting the Boocock and Rice reference) that cleavage might occur in trans has been re-written to specify cleavage of the ends and/or the target. However, because the catalytic protomers must be in a very different structure than the solution and PEC-associated dimer, we really cannot firmly speak to this issue. In the Figure 8 models that are based on the solution dimer structures, the serines are >50 Å from the DNA to which they are bound.

3) Figure 4 to Figure 8 (and figure supplements) and corresponding text. This section represents the heart of the paper, with extensive characterization of TnpA binding to the repeat elements of the left and right ends by EMSA, DNase I and exonuclease footprinting, etc. The conclusion from the cumulative assays is the nucleation of a TnpA filament at the left end followed by capture of the right end to form the transposition synapse (paired end complex; PEC).

Figure 3 and supplements show the ability of the left end to form a PEC, as well as the inability of the right end to do so. The assay uses two separate DNA fragments, one as the radio-labeled probe for binding, the second as an unlabeled fragment for capture into the PEC. If two right ends are present on the same DNA fragment, do the ends come together in a PEC?

Are all the binding elements a-d present at the left end required for PEC formation with right end? If the left end is replaced by a copy of the right end, will transposition occur?

Will additional binding elements at the right end be inhibitory to PEC formation and transposition? Here, one would expect filament formation at both ends.

To be clear, our in vitro assays are primarily analyzing the robust assembly of two left ends into a PEC. Two right ends will weakly synapse (Figure 3E) and a left end will weakly synapse with a right end (Figure 3F). We have not tested left end deletions for their ability to synapse with the right end, nor added motifs to the right end to improve its activity, and have not tested an IS1535 RE-RE construct for transposition.

4) Continued from 3. According to the PEC model, the ends are bridged by a series of TnpA dimers initially lined up at the left end. Do the present results exclude the possibility that the PEC formation is mediated via dimers of TnpA dimers, bound at the left and the right ends?

Is it possible to think of the extra binding elements abutting those next to the scissile phosphates as accessory sites, by analogy to Tn3 resolvase? In such a scenario, the TnpA dimers bound to the accessory sites may allosterically activate the TnpA dimers that perform DNA cleavage/transfer. The positioning of the TnpA dimers within one of the two ends cannot possibly form a high-order topological arrangement. However, a functionally relevant inter-wrapping of DNA between the bound ends would seem plausible, and is rather fleetingly alluded to by the authors.

See responses to principal comments 3 and 6, and comment 1 concerning the PEC assembly and allosteric activation models. We do mention that the PEC may be assembled into a more inter-wrapped structure than depicted in Figure 8E in the Discussion section, Figure 8E legend, and Figure 8—figure supplement 3 legend.

5) Target capture? While the authors have presented detailed arguments to highlight the importance of the PEC in initiating the chemistry of transposition, the potential means of target capture and the mechanism of cleavage at the insertion site is rather unclear. In the PEC-target complex, is there a TnpA dimer at the target site, or is target cleavage accomplished by TnpA associated with the PEC? Obviously, the experiments presented here do not address this question directly. However, it would be helpful to think about target capture/cleavage in light of existing transposition models.

Perhaps an additional figure expanding on the mechanistic implications of the binding model shown in Figure 8 in cleavage and strand transfer may be appropriate (even if speculative). For example, are the cut-out of the transposon, insertion into the target, and healing of the donor concerted events? Or, does transposon excision from the donor precede its integration into the target?

The authors seem to rule out DNA rotation following cleavage as part of the strand transfer event. Is this inference consistent with the interface of the TnpA dimer in the structure? Can one ignore the possibility of rotation between cleaved transposon ends and target ends?

See responses to principal comments 4 and 8 concerning target capture and subunit rotation.

https://doi.org/10.7554/eLife.39611.032

Article and author information

Author details

  1. Wenyang Chen

    Department of Biological Chemistry, David Geffen School of Medicine, University of California at Los Angeles, Los Angeles, United States
    Contribution
    Conceptualization, Investigation, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-3035-1496
  2. Sridhar Mandali

    Department of Biological Chemistry, David Geffen School of Medicine, University of California at Los Angeles, Los Angeles, United States
    Contribution
    Conceptualization, Investigation, Methodology
    Competing interests
    No competing interests declared
  3. Stephen P Hancock

    Department of Biological Chemistry, David Geffen School of Medicine, University of California at Los Angeles, Los Angeles, United States
    Present address
    Department of Chemistry, Towson University, Towson, United States
    Contribution
    Investigation, Visualization, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-4205-7913
  4. Pramod Kumar

    Department of Biological Chemistry, David Geffen School of Medicine, University of California at Los Angeles, Los Angeles, United States
    Present address
    National Center for Cell Science, Maharashtra, India
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  5. Michael Collazo

    Department of Energy Institute of Genomics and Proteomics, University of California at Los Angeles, Los Angeles, United States
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  6. Duilio Cascio

    Department of Energy Institute of Genomics and Proteomics, University of California at Los Angeles, Los Angeles, United States
    Contribution
    Resources, Supervision, Investigation, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  7. Reid C Johnson

    1. Department of Biological Chemistry, David Geffen School of Medicine, University of California at Los Angeles, Los Angeles, United States
    2. Molecular Biology Institute, University of California at Los Angeles, Los Angeles, United States
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Visualization, Methodology, Writing—original draft, Project administration
    For correspondence
    rcjohnson@mednet.ucla.edu
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-5562-1934

Funding

National Institute of General Medical Sciences (GM038509)

  • Wenyang Chen
  • Sridhar Mandali
  • Stephen P Hancock
  • Pramod Kumar
  • Reid C Johnson

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We are grateful to Dennis Grogan (University of Cincinnati) for providing us with genomic DNA of S. islandicus pyrE::ISC1926 and David Eisenberg (UCLA) for genomic DNA of M. tuberculosis. We thank Rachel Ogorzalek Loo (UCLA), and Nuraly Avliyakulov and Michael Haykinson (UCLA), for performing mass spectrometry. The UCLA X-ray core facility is supported in part by the Department of Energy grant DE-FC0302ER63421. We thank the Northeastern Collaborative Access Team beamline NECAT ID-24 at the Advanced Photon Source of Argonne National Laboratory, which is supported by National Institutes of Health grants P41 RR015301, S10 RR029205, and P41 GM103403. Use of the Advanced Photon Source is supported by the Department of Energy under Contract DE-AC02-06CH11357. This work was supported by NIH grant GM038509 to RCJ.

Senior and Reviewing Editor

  1. Gisela Storz, National Institute of Child Health and Human Development, United States

Reviewing Editor

  1. Stephen C Kowalczykowski, University of California, Davis, United States

Publication history

  1. Received: June 27, 2018
  2. Accepted: September 18, 2018
  3. Version of Record published: October 5, 2018 (version 1)
  4. Version of Record updated: October 8, 2018 (version 2)

Copyright

© 2018, Chen et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • Page views: 556
  • Downloads: 83
  • Citations: 0

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

