An updated phylogeny of the Alphaproteobacteria reveals that the parasitic Rickettsiales and Holosporales have independent origins

  1. Sergio A Muñoz-Gómez
  2. Sebastian Hess
  3. Gertraud Burger
  4. B Franz Lang
  5. Edward Susko
  6. Claudio H Slamovits
  7. Andrew J Roger (corresponding author)
  1. Dalhousie University, Canada
  2. University of Cologne, Germany
  3. Université de Montréal, Canada
3 figures, 2 tables and 3 additional files

Figures

Figure 1
Compositional heterogeneity in the Alphaproteobacteria is a major factor that confounds phylogenetic inference.

There are great disparities in the genome G + C% content and amino acid compositions of the Rickettsiales, Pelagibacterales (including alphaproteobacterium HIMB59) and Holosporales with all other …

https://doi.org/10.7554/eLife.42535.004
Figure 2 with 7 supplements
Decreasing compositional heterogeneity by removing compositionally biased sites disrupts the clustering of the Rickettsiales, Pelagibacterales (including alphaproteobacterium HIMB59) and Holosporales.

All branch support values are 100% SH-aLRT and 100% UFBoot unless annotated. (A) A maximum-likelihood tree inferred under the LG+PMSF(ES60)+F+R6 model and from the untreated dataset which is …

https://doi.org/10.7554/eLife.42535.005
Figure 2—figure supplement 1
A labeled version showing taxon names for Figure 2.

Branch support values are 100% SH-aLRT and 100% UFBoot unless annotated.

https://doi.org/10.7554/eLife.42535.006
Figure 2—figure supplement 2
A diagram of the strategies and phylogenetic analyses employed in this study.
https://doi.org/10.7554/eLife.42535.007
Figure 2—figure supplement 3
Bayesian consensus trees inferred with PhyloBayes MPI v1.7 and the CAT-Poisson+Γ4 model.

Branch support values are 1.0 posterior probabilities unless annotated. (A) Bayesian consensus tree inferred from the full dataset, which is highly compositionally heterogeneous. (B) Bayesian …

https://doi.org/10.7554/eLife.42535.008
Figure 2—figure supplement 4
Maximum-likelihood trees to assess the placements of the Holosporales, Rickettsiales, Pelagibacterales and alphaproteobacterium HIMB59 when all four groups are included.

Branch support values are 100% SH-aLRT and 100% UFBoot unless annotated. (A) A tree that results from the analysis of the untreated dataset. (B) A tree that results from the analysis of a dataset …

https://doi.org/10.7554/eLife.42535.009
Figure 2—figure supplement 5
Maximum-likelihood tree inferred under the simpler LG4X model from the untreated dataset, from which no taxon has been removed.

In this tree, derived from an analysis using a model that does not account for compositional heterogeneity across sites, the Geminicoccaceae has a more derived placement within the Rhodospirillales

https://doi.org/10.7554/eLife.42535.010
Figure 2—figure supplement 6
Constraint tree used for the IQ-TREE analyses, labeled with taxon names and the degree of missing data per taxon.

Magnetococcales in gray; Rickettsiales in brown; Pelagibacterales in maroon; Holosporales in light blue; Rhizobiales in green; Caulobacterales in orange; Rhodobacterales in red; Sneathiellales in …

https://doi.org/10.7554/eLife.42535.011
Figure 2—figure supplement 7
GARP:FIMNKY ratios across the proteomes of the 120 alphaproteobacteria and outgroup used in this study.
https://doi.org/10.7554/eLife.42535.012
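
The GARP:FIMNKY ratio named in this caption can be illustrated with a minimal sketch. It assumes the ratio is the pooled count of G, A, R and P residues divided by the pooled count of F, I, M, N, K and Y residues across a proteome; the function name and the pooling choice are illustrative assumptions, not taken from the study.

```python
from collections import Counter

# G, A, R and P are encoded mostly by G+C-rich codons;
# F, I, M, N, K and Y are encoded mostly by A+T-rich codons.
GARP = "GARP"
FIMNKY = "FIMNKY"


def garp_fimnky_ratio(proteins):
    """Return the GARP:FIMNKY ratio pooled over an iterable of protein strings.

    Hypothetical helper: residues are pooled over the whole proteome
    before the ratio is taken (an assumption of this sketch).
    """
    counts = Counter()
    for seq in proteins:
        counts.update(seq.upper())
    garp = sum(counts[aa] for aa in GARP)
    fimnky = sum(counts[aa] for aa in FIMNKY)
    if fimnky == 0:
        raise ValueError("no F, I, M, N, K or Y residues found")
    return garp / fimnky
```

Under this definition, a proteome from a G+C-rich genome would be expected to give a ratio above 1, and one from an A+T-rich genome a ratio below 1.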
Figure 3 with 8 supplements
The Holosporales (renamed and lowered in rank to the Holosporaceae family here) branches in a derived position within the Rhodospirillales when compositional heterogeneity is reduced and the long-branched and compositionally biased Rickettsiales, Pelagibacterales, and alphaproteobacterium HIMB59 are removed.

Branch support values are 100% SH-aLRT and 100% UFBoot unless annotated. (A) A maximum-likelihood tree, inferred under the LG+PMSF(ES60)+F+R6 model, to place the Holosporaceae in the absence of …

https://doi.org/10.7554/eLife.42535.013
Figure 3—figure supplement 1
Maximum-likelihood trees to assess the placement of the Holosporales in the absence of the Rickettsiales, Pelagibacterales and alphaproteobacterium HIMB59.

Branch support values are 100% SH-aLRT and 100% UFBoot unless annotated. (A) A tree that results from the analysis of the untreated dataset. (B) A tree that results from the analysis of a dataset …

https://doi.org/10.7554/eLife.42535.014
Figure 3—figure supplement 2
Maximum-likelihood trees to assess the placement of the Rickettsiales in the absence of the Holosporales, Pelagibacterales, and alphaproteobacterium HIMB59.

Branch support values are 100% SH-aLRT and 100% UFBoot unless annotated. (A) A tree that results from the analysis of the untreated dataset. (B) A tree that results from the analysis of a dataset …

https://doi.org/10.7554/eLife.42535.015
Figure 3—figure supplement 3
Maximum-likelihood trees to assess the placement of the Rickettsiales in the absence of the Holosporales, Pelagibacterales, alphaproteobacterium HIMB59 and the Beta-, and Gammaproteobacteria outgroup.

Branch support values are 100% SH-aLRT and 100% UFBoot unless annotated. (A) A tree that results from the analysis of the untreated dataset. (B) A tree that results from the analysis of a dataset …

https://doi.org/10.7554/eLife.42535.016
Figure 3—figure supplement 4
Maximum-likelihood trees to assess the placement of the Pelagibacterales in the absence of the Holosporales, Rickettsiales and alphaproteobacterium HIMB59.

Branch support values are 100% SH-aLRT and 100% UFBoot unless annotated. (A) A tree that results from the analysis of the untreated dataset. (B) A tree that results from the analysis of a dataset …

https://doi.org/10.7554/eLife.42535.017
Figure 3—figure supplement 5
Maximum-likelihood trees to assess the placement of alphaproteobacterium HIMB59 in the absence of the Holosporales, Rickettsiales and Pelagibacterales.

Branch support values are 100% SH-aLRT and 100% UFBoot unless annotated. (A) A tree that results from the analysis of the untreated dataset. (B) A tree that results from the analysis of a dataset …

https://doi.org/10.7554/eLife.42535.018
Figure 3—figure supplement 6
Bayesian consensus trees inferred with PhyloBayes MPI v1.7 and the CAT-Poisson+Γ4 model.

Branch support values are 1.0 posterior probabilities unless annotated. (A) Bayesian consensus tree inferred to place the Holosporales in the absence of the Rickettsiales and the Pelagibacterales

https://doi.org/10.7554/eLife.42535.019
Figure 3—figure supplement 7
Maximum-likelihood trees to assess the placement of the Holosporales when the fast-evolving Holospora and ‘Candidatus Hepatobacter’ are also included in the absence of the Rickettsiales, Pelagibacterales and alphaproteobacterium HIMB59.

Branch support values are 100% SH-aLRT and 100% UFBoot unless annotated.

https://doi.org/10.7554/eLife.42535.020
Figure 3—figure supplement 8
Bayesian consensus tree inferred to place the Holosporales in the absence of the Pelagibacterales, alphaproteobacterium HIMB59, and Rickettsiales, and when the data have been recoded into a six-character state alphabet (the dataset-specific recoding scheme S6: AQEHISV RKMT PY DCLF NG W) to reduce compositional heterogeneity.

Branch support values are 1.0 posterior probabilities unless annotated.

https://doi.org/10.7554/eLife.42535.021
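
The six-state recoding in this caption can be sketched as a simple lookup that maps each amino acid to the index of its S6 group. The groups are those listed in the caption; the digit labels 0 to 5, the function name and the treatment of gaps and ambiguous residues are assumptions made for illustration only.

```python
# S6 groups as given in the caption: AQEHISV RKMT PY DCLF NG W
S6_GROUPS = ["AQEHISV", "RKMT", "PY", "DCLF", "NG", "W"]

# Map every amino acid to the index of its group (labels "0".."5" are assumed).
RECODE = {aa: str(i) for i, group in enumerate(S6_GROUPS) for aa in group}


def recode_s6(seq, missing="-"):
    """Recode a protein sequence into the six-state S6 alphabet.

    Characters outside the 20 standard amino acids (gaps, X, and so on)
    are written as `missing`; this handling is an assumption of the sketch.
    """
    return "".join(RECODE.get(aa, missing) for aa in seq.upper())
```

For example, `recode_s6("RKPYNG")` gives `"112244"`, because R and K fall in the second group, P and Y in the third, and N and G in the fifth.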

Tables

Table 1
Genome features for the three novel rickettsialeans sequenced in this study.

See Supplementary file 1 as well.

https://doi.org/10.7554/eLife.42535.003
Species         ‘Candidatus Finniella inopinata’    Stachyamoeba-associated rickettsialean  Peranema-associated rickettsialean
Genome size     1,792,168 bp                        1,738,386 bp                            1,375,759 bp
N50             174,737 bp                          1,738,386 bp                            28,559 bp
Contig number   28                                  1                                       125
Gene number     1741                                1588                                    1223
A + T% content  56.58%                              67.01%                                  59.13%
Family          ‘Candidatus Paracaedibacteraceae’   Rickettsiaceae                          ‘Candidatus Midichloriaceae’
Order           Holosporales                        Rickettsiales                           Rickettsiales
Completeness    94.96%                              97.12% (=100%)                          92.08%
Redundancy      0.0%                                0.0%                                    2.1%

  1. Gene number as predicted by Prokka v.1.13 (rRNA genes were searched with BLAST).
  2. Completeness and redundancy as estimated by Anvi’o v.2.4.0 using the Campbell et al., 2013 marker gene set.
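
As an aid to reading the N50 row of Table 1, the following is a minimal sketch of the standard N50 definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The function name is hypothetical, and the sketch does not reproduce whatever tool the authors used to compute the values.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths in bp."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that brings the running total to half of the
        # assembly size defines N50.
        if running >= half:
            return length
    raise ValueError("no contigs given")
```

For a genome assembled into a single contig, N50 equals the genome size, which is consistent with the Stachyamoeba-associated rickettsialean in Table 1 (N50 and genome size are both 1,738,386 bp).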

Table 2
A higher-level classification scheme for the Alphaproteobacteria and the Magnetococcia classes within the Proteobacteria, and the Rickettsiales and Rhodospirillales orders within the Alphaproteobacteria.
https://doi.org/10.7554/eLife.42535.022
Class 1. Alphaproteobacteria Garrity et al., 2005
             Subclass 1. Rickettsidae Ferla et al., 2013 emend. Muñoz-Gómez et al. 2019 (this work)
                             Order 1. Rickettsiales Gieszczkiewicz, 1939 emend. Dumler et al., 2001
                                          Family 1. Anaplasmataceae Philip, 1957
                                          Family 2. 'Candidatus Midichloriaceae' Montagna et al., 2013
                                          Family 3. Rickettsiaceae Pinkerton, 1936
             Subclass 2. Caulobacteridae Ferla et al., 2013 emend. Muñoz-Gómez et al. 2019
                             Order 1. Rhodospirillales Pfennig and Trüper, 1971 emend. Muñoz-Gómez et al. 2019
                                          Family 1. Acetobacteraceae (ex Henrici 1939) Gillis and De Ley, 1980
                                          Family 2. Rhodospirillaceae Pfennig and Trüper, 1971 emend. Muñoz-Gómez et al. 2019
                                          Family 3. Azospirillaceae fam. nov. Muñoz-Gómez et al. 2019
                                          Family 4. Holosporaceae Szokoli et al., 2016
                                          Family 5. Rhodovibriaceae fam. nov. Muñoz-Gómez et al. 2019
                                          Family 6. Geminicoccaceae Proença et al., 2018
                             Order 2. Sneathiellales Kurahashi et al., 2008
                             Order 3. Sphingomonadales Yabuuchi and Kosako, 2005
                             Order 4. Pelagibacterales Grote et al., 2012
                             Order 5. Rhodobacterales Garrity et al., 2005
                             Order 6. Caulobacterales Henrici and Johnson, 1935
                             Order 7. Rhizobiales Kuykendall, 2005
Class 2. Magnetococcia Parks et al., 2018
                             Order 1. Magnetococcales Bazylinski et al., 2013

Additional files

Supplementary file 1

A 16S rRNA gene maximum-likelihood tree of the Rickettsiales and Holosporales that phylogenetically places the three endosymbionts whose genomes were sequenced in this study.

(1) ‘Candidatus Finniella inopinata’ endosymbiont of Viridiraptor invadens strain Virl02, (2) an alphaproteobacterium associated with Peranema trichophorum strain CCAP 1260/1B, and (3) an alphaproteobacterium associated with Stachyamoeba lipophora strain ATCC 50324. Branch support values are SH-aLRT and UFBoot.

https://doi.org/10.7554/eLife.42535.023
Supplementary file 2

Supplementary tables.

(A) Ultrafast bootstrap (UFBoot) variation for several clades discussed in this study as compositionally biased sites, according to ɀ, are progressively removed in steps of 10%.
(B) UFBoot variation for several clades discussed in this study as the fastest sites are progressively removed in steps of 10%.
(C) GenBank assembly accession numbers for the 120 alphaproteobacterial and outgroup genomes used in this study.
(D) A list of the least compositionally heterogeneous genes out of the 200 single-copy and vertically inherited genes used in this study.
(E) Model fit of amino acid replacement matrices as components of simple models that do not account for compositional heterogeneity across sites.
(F) Model fit of amino acid replacement matrices as components of complex models that account for compositional heterogeneity across sites.
(G) Model fit of LG+ES60+F for which the model component that accounts for rate heterogeneity across sites varies.
(H) Several summary statistics for the PhyloBayes MCMC chains run for each analysis under the CAT-Poisson+Γ4 model.
In (E), (F) and (G), models are ordered from lowest to highest BIC. -LnL: log-likelihood; df: degrees of freedom or number of free parameters; AIC: Akaike information criterion; AICc: corrected Akaike information criterion; BIC: Bayesian information criterion.

https://doi.org/10.7554/eLife.42535.024
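
The model-fit tables (E) to (G) of Supplementary file 2 report AIC, AICc and BIC and order models by BIC. The sketch below shows the textbook formulas for these criteria and a ranking by BIC. Taking the number of alignment sites as the sample size is an assumption, and the function names and any values passed in are hypothetical, not results from the study.

```python
import math


def information_criteria(log_likelihood, df, n_sites):
    """Return AIC, AICc and BIC for one fitted model.

    log_likelihood: maximized log-likelihood (lnL, usually negative)
    df: number of free parameters
    n_sites: sample size, assumed here to be the number of alignment sites
    """
    aic = 2 * df - 2 * log_likelihood
    aicc = aic + (2 * df * (df + 1)) / (n_sites - df - 1)
    bic = df * math.log(n_sites) - 2 * log_likelihood
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


def rank_by_bic(models, n_sites):
    """Order model names from lowest (best) to highest BIC.

    models: dict mapping a model name to a (log_likelihood, df) tuple.
    """
    return sorted(
        models,
        key=lambda name: information_criteria(*models[name], n_sites)["BIC"],
    )
```

Because the BIC penalty grows with the logarithm of the sample size, a parameter-rich model is ranked first only when its gain in log-likelihood outweighs that penalty.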
Transparent reporting form
https://doi.org/10.7554/eLife.42535.025
