Genetic organization, signature sequence motifs, structural models, and phyletic distribution of McrB GTPases detected in this work
A) McrBC is a two-component restriction system whose components are typically encoded by separate genes (gene fusions are extremely rare) expressed as a single operon. Genes are depicted here and in subsequent figures as arrows pointing in the direction of transcription. In most cases, McrB is the upstream gene in the operon. B) In the prototypical E. coli K-12 McrBC system, McrB contains an N-terminal methylcytosine-binding domain, ADAM/DUF3578, fused to a GTPase of the AAA+ ATPase clade (Sukackaite et al., 2012). This GTPase contains the Walker A and Walker B motifs that are conserved in P-loop NTPases as well as a signature NxxD motif, all of which are required for GTP hydrolysis (Nirwan et al., 2019, Niu et al., 2020, Pieper et al., 1999). McrC consists of a PD-D/ExK nuclease and an N-terminal DUF2357 domain, which comprises a helical bundle with a stalk-like extension that interacts with and activates individual McrB GTPases while they are assembled into hexamers (Niu et al., 2020, Nirwan et al., 2019). C) AlphaFold2 structural model of the E. coli K-12 ADAM-McrB GTPase fusion protein monomer, shown with the separate X-ray diffraction and cryo-EM structures of the ADAM and GTPase domains (Niu et al., 2020, Sukackaite et al., 2012). D) AlphaFold2 structural model and cryo-EM structure of the E. coli K-12 McrC monomer with the DUF2357-PD-D/ExK architecture (Niu et al., 2020). The structures were visualized with ChimeraX (Pettersen et al., 2021). E) Phyletic distribution of the McrB GTPase homologs detected in this work, each found in a genomic island with a distinct domain composition. The homologs were clustered at a similarity threshold of 0.9, and each was then assigned a weight inversely proportional to the size of its cluster (see Methods).
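The weighting used for panel E can be illustrated with a minimal Python sketch. It assumes that each homolog receives a weight of exactly 1 divided by the size of its cluster, so that every cluster contributes the same total weight regardless of how densely it was sampled. The procedure actually used is described in Methods and may differ in detail. The function names, sequence identifiers, cluster labels, and taxon labels below are hypothetical.

```python
from collections import Counter, defaultdict


def inverse_cluster_weights(cluster_of):
    """Map each sequence ID to 1 / (size of the cluster it belongs to).

    cluster_of: dict of sequence ID -> cluster ID (hypothetical input format).
    """
    sizes = Counter(cluster_of.values())
    return {seq: 1.0 / sizes[cid] for seq, cid in cluster_of.items()}


def weighted_phyletic_distribution(cluster_of, taxon_of):
    """Return the weighted fraction of homologs assigned to each taxon.

    taxon_of: dict of sequence ID -> taxon label (hypothetical input format).
    """
    weights = inverse_cluster_weights(cluster_of)
    totals = defaultdict(float)
    for seq, weight in weights.items():
        totals[taxon_of[seq]] += weight
    grand_total = sum(totals.values())
    return {taxon: total / grand_total for taxon, total in totals.items()}


# Toy data: three near-identical homologs in one cluster, one singleton.
cluster_of = {"s1": "c1", "s2": "c1", "s3": "c1", "s4": "c2"}
taxon_of = {"s1": "taxonA", "s2": "taxonA", "s3": "taxonA", "s4": "taxonB"}

distribution = weighted_phyletic_distribution(cluster_of, taxon_of)
for taxon, fraction in sorted(distribution.items()):
    print(f"{taxon}: {fraction:.2f}")
```

In this toy case the unweighted counts would give taxonA three quarters of the distribution, whereas the weighted fractions are equal, because the three redundant sequences in cluster c1 together count as much as the single sequence in c2.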