Phylogenomic and mitogenomic data can accelerate inventorying of tropical beetles during the current biodiversity crisis

  1. Michal Motyka
  2. Dominik Kusy
  3. Matej Bocek
  4. Renata Bilkova
  5. Ladislav Bocak (corresponding author)
  1. Laboratory of Biodiversity and Molecular Evolution, Czech Advanced Technology Research Institute, Centre of Region Hana, Czech Republic
21 figures, 7 tables and 5 additional files

Figures

Distribution and appearance of metriorrhynchine net-winged beetles.

(A) Distribution of Metriorrhynchini with major sampled localities designated by red dots. The numbers of analyzed specimens are shown for individual regions and subtribes. (B–D) General appearance of Metriorrhynchini.

Topologies recovered by phylogenomic analyses.

(A) Phylogenetic relationships of Metriorrhynchinae based on the ML analyses of the concatenated amino-acid sequence data of supermatrix F-1490-AA_Bacoca_decisive. Unmarked branches are supported by 100/100 UFB/SH-aLRT; red circles depict lower phylogenetic branch support. (B) Phylogenetic relationships of Metriorrhynchini recovered by the coalescent phylogenetic analysis with ASTRAL when analyzing the full set of gene trees (4109 gene trees inferred at the nucleotide level). Pie charts on branches show ASTRAL quartet support (quartet-based frequencies of alternative quadripartition topologies around a given internode). Outgroup taxa are not shown. (C) Results of FcLM analyses for selected phylogenetic hypotheses applied at the amino-acid sequence level (supermatrix F). (D) Alternative phylogenetic relationships of Metriorrhynchinae based on the ML analyses of the concatenated amino-acid sequence data of supermatrix A-4109-AA. Numbers depict phylogenetic branch support values based on 5000 ultrafast bootstrap replicates.

Topologies recovered by mitogenomic analyses.

(A) Relationships of Metriorrhynchini recovered by the constrained analysis of the dataset pruned at 2% distance. The full-resolution tree is shown in Source data 2, and the tree recovered from the analysis of the complete dataset of 6429 terminals is shown in Source data 1. The asterisk designates a grade of Metriorrhynchina-like taxa found in a position that conflicts with their morphology. (B) A chart of Robinson-Foulds distances among topologies inferred by repeated runs of the constrained and unconstrained analyses. (C) A comparison of the results obtained by two runs of the constrained analysis. (D) A comparison of trees inferred with and without the phylogenomic backbone. (E) A comparison of the results obtained by two runs of the unconstrained analysis. The red lines designate terminals with conflicting positions in the compared trees.

Identification of sexual dimorphism by large-scale biodiversity inventory.

(A) Relationships of lineages with modified ontogeny, the dated tree; (B, D) General appearance and head of Cautires apterus, a putative neotenic species; (C, E) The same views of a close relative with both sexes winged. Mimetic sexual dimorphism identified during the diversity survey. (F) The dated tree; red terminal labels designate the individuals shown in G and H; (G) Dorsal view of individuals in copula; (H) The same individuals, lateral view. Other than collecting individuals in copula, DNA-based assessment of relationships is the only way to associate the sexes, because the species are sexually dimorphic and no morphological traits indicate their conspecificity.

A sequence of applied methods from sampling to hypotheses.
Appendix 1—figure 1
Maximum likelihood trees from IQ-TREE amino acid analysis of dataset A-4109-AA.

(A) With partitioning by gene and (B) without partitioning. The depicted branch support values represent SH-aLRT and ultrafast bootstrap.

Appendix 1—figure 2
Maximum likelihood trees from IQ-TREE amino acid analysis.

(A) Analysis of the dataset A-4109-AA with optimization of the partitioning scheme and (B) analysis of the dataset D-3370-AA_Bacoca with partitioning by gene. The depicted branch support values represent SH-aLRT and ultrafast bootstrap.

Appendix 1—figure 3
Maximum likelihood trees from IQ-TREE amino acid analysis.

(A) Analysis of the dataset F-1490-AA_Bacoca_decisive and (B) analysis of the dataset J-2129-AA_MARE. Both datasets were partitioned by gene. The depicted branch support values represent SH-aLRT and ultrafast bootstrap.

Appendix 1—figure 4
Maximum likelihood trees from IQ-TREE nucleotide analysis.

(A) Analysis of the dataset B-4109-NT and (B) analysis of the dataset E-4109-NT2 using only second codon positions. Both datasets were partitioned by gene. The depicted branch support values represent SH-aLRT and ultrafast bootstrap.

Appendix 1—figure 5
Maximum likelihood trees from IQ-TREE nucleotide analysis.

(A) Analysis of the dataset C-4109-NT12 using codon positions 1 + 2 and (B) analysis of the dataset G-NT-1767_MaxSymTest. Both datasets were partitioned by gene. The depicted branch support values represent SH-aLRT and ultrafast bootstrap.

Appendix 1—figure 6
Maximum likelihood trees from IQ-TREE nucleotide analysis.

(A) Analysis of the dataset H-NT-1645_MaxSymTestmarginal and (B) analysis of the dataset I-NT-3905_MaxSymTestInternal. Both datasets were partitioned by gene. The depicted branch support values represent SH-aLRT and ultrafast bootstrap.

Appendix 1—figure 7
Topologies recovered by ASTRAL analyses.

(A) ASTRAL species tree with branch lengths in coalescent units, inferred from the individual IQ-TREE maximum likelihood gene trees of the nucleotide dataset B-4109-NT, and (B) ASTRAL species tree with branch lengths in coalescent units, inferred from the individual IQ-TREE maximum likelihood gene trees of the amino acid dataset A-4109-AA. Numbers on nodes show local posterior probabilities (pp1). Quartet support for the alternative topologies (q1, q2, and q3), the total number of induced quartet trees in the gene trees that support the alternative topologies (f1, f2, f3), and local posterior probabilities (pp1, pp2, pp3) are available in the Dryad repository.
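
The relation between the quartet counts (f1, f2, f3) and the quartet support values (q1, q2, q3) mentioned in this caption can be sketched in Python. This is a minimal illustration that assumes each q value is the corresponding f count divided by the sum of the three counts; the function name and the example counts are hypothetical and are not taken from the study.

```python
def quartet_support(f1, f2, f3):
    """Normalize quartet counts for the three topologies around one branch."""
    total = f1 + f2 + f3
    if total == 0:
        raise ValueError("no informative quartets for this branch")
    return f1 / total, f2 / total, f3 / total


# Hypothetical counts for a single internal branch (not from the study).
q1, q2, q3 = quartet_support(600, 250, 150)
print(q1, q2, q3)
```

A branch whose q1 is close to 1 is supported by nearly all gene-tree quartets, whereas three values near one third each indicate that the gene trees do not favor any of the alternative topologies.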

Appendix 1—figure 8
Phylogenomic topologies.

(A) Phylogenetic relationships of Metriorrhynchini resulting from the summary coalescent phylogenetic analysis with ASTRAL when analyzing the full set of gene trees (4109 gene trees) inferred at the nucleotide level. (B) The same analysis at the amino-acid level. Pie charts on branches show ASTRAL quartet supports q1, q2, and q3 (quartet-based frequencies of alternative quadripartition topologies around a given internode). Tree topologies correspond to Appendix 1—figure 7.

Appendix 1—figure 9
AliStat heatmaps.

(A–D) AliStat rectangular heatmaps showing pairwise alignment completeness scores for all species included in the analyzed amino acid supermatrices. The abbreviations of datasets correspond to Appendix 1—table 4. Values closer to one indicate higher completeness scores for the pairwise sequence comparisons.

Appendix 1—figure 10
AliStat heatmaps.

(A–F) AliStat rectangular heatmaps showing pairwise alignment completeness scores for all species included in the analyzed nucleotide supermatrices. The abbreviations of datasets correspond to Appendix 1—table 4. Values closer to one indicate higher completeness scores for the pairwise sequence comparisons.

Appendix 1—figure 11
SymTest analyses.

(A–D) Rectangular heatmaps calculated with SymTest showing p-values for the pairwise Bowker’s tests in the analyzed amino acid supermatrices. Darker boxes indicate lower p-values and thus larger deviation from evolution under SRH conditions.

Appendix 1—figure 12
SymTest analyses.

(A–F) Rectangular heatmaps calculated with SymTest showing p-values for the pairwise Bowker’s tests in the analyzed nucleotide supermatrices. Darker boxes indicate lower p-values and thus larger deviation from evolution under SRH conditions.
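
Bowker’s test of symmetry, which underlies these heatmaps, compares the off-diagonal cells of the divergence matrix of two aligned sequences. The Python sketch below is an illustration only and is not the SymTest implementation: the function name, the nucleotide alphabet, and the toy sequences are assumptions. It returns the test statistic and the degrees of freedom; a p-value would then be read from the chi-square distribution with that many degrees of freedom.

```python
from itertools import combinations


def bowker_statistic(seq_a, seq_b, alphabet="ACGT"):
    """Return (statistic, degrees of freedom) of Bowker's symmetry test."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Divergence matrix: counts of state x in seq_a paired with state y in seq_b.
    counts = {(x, y): 0 for x in alphabet for y in alphabet}
    for x, y in zip(seq_a, seq_b):
        if x in alphabet and y in alphabet:  # skip gaps and ambiguity codes
            counts[(x, y)] += 1
    statistic = 0.0
    dof = 0
    for x, y in combinations(alphabet, 2):
        n_xy, n_yx = counts[(x, y)], counts[(y, x)]
        if n_xy + n_yx > 0:  # only informative cell pairs contribute
            statistic += (n_xy - n_yx) ** 2 / (n_xy + n_yx)
            dof += 1
    return statistic, dof


# Toy pair: three A->C against one C->A, and one G->T against no T->G.
print(bowker_statistic("AAAACCCCGG", "CCCAACCCGT"))  # (2.0, 2)
```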

Appendix 1—figure 13
Results of FcLM analyses testing alternative phylogenetic hypotheses about placement of procautirine and leptotrichaline clades applied for various supermatrices.

The first column shows the results of FcLM when the original data were analyzed. The second column shows the results of FcLM after the phylogenetic signal had been eliminated from the data. The third column shows the results of FcLM after elimination of the phylogenetic signal and inhomogeneous amino-acid or nucleotide composition. The fourth column shows the results of FcLM after elimination of the phylogenetic signal and inhomogeneous amino-acid or nucleotide composition, and with randomized data coverage within all meta-partitions (see Supplementary methods for details).

Appendix 1—figure 14
Tribal and subtribal mOTU delimitation using CD-HIT-EST.

The X axis represents the number of operational taxonomic units, whereas the Y axis shows the delimitation threshold.
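
How the number of mOTUs depends on the delimitation threshold can be illustrated with a small greedy clustering sketch in Python. It is a simplified stand-in, not CD-HIT-EST itself: it assumes pre-aligned sequences of equal length, measures identity as the share of matching positions, and assigns each sequence to the first representative that meets the threshold. All function names and the toy sequences are hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs, threshold):
    """Map each representative name to the names clustered with it."""
    clusters = {}
    # Longest sequences first; ties keep the input order (sorted is stable).
    ordered = sorted(seqs, key=lambda name: -len(seqs[name].replace("-", "")))
    for name in ordered:
        for rep in clusters:
            if identity(seqs[name], seqs[rep]) >= threshold:
                clusters[rep].append(name)
                break
        else:  # no representative was similar enough
            clusters[name] = [name]
    return clusters


def mutate(seq, positions):
    """Return a copy of seq with a different base at each given position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for pos in positions:
        chars[pos] = swap[chars[pos]]
    return "".join(chars)


base = "ACGT" * 25  # 100 sites
toy = {
    "s1": base,
    "s2": mutate(base, [50]),       # 1% divergent from s1
    "s3": mutate(base, range(4)),   # 4% divergent from s1
    "s4": mutate(base, range(10)),  # 10% divergent from s1
}
for threshold in (0.98, 0.95):
    print(threshold, len(greedy_cluster(toy, threshold)))
```

With these toy data the 98% identity threshold (2% distance) yields three clusters and the 95% threshold (5% distance) yields two, which mirrors the general pattern that a stricter threshold splits the same material into more mOTUs.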

Appendix 1—figure 15
Distribution of selected Metriorrhynchini genera (originally assumed and revised).
Appendix 1—figure 16
Comparison of the Robinson-Foulds (RF) distances among tree searches with constrained topology and tree searches with unconstrained topology.

Green values represent the top 10% of RF distances, and red values show the bottom 10% of RF distances. The blue part of the graph represents comparisons among constrained trees, the grey part represents comparisons among unconstrained trees, and the red part represents comparisons between constrained and unconstrained trees.
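
For readers unfamiliar with the metric, the Robinson-Foulds distance counts the bipartitions (splits) that are present in one unrooted tree but not in the other. The Python sketch below illustrates the idea on toy trees; it assumes simple Newick strings without branch lengths, support values, or spaces, and all function names are hypothetical. It is not the tool used to produce the figure.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested tuples of leaf names."""
    text = text.strip().rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # skip "("
            children = [node()]
            while text[pos] == ",":
                pos += 1
                children.append(node())
            pos += 1  # skip ")"
            return tuple(children)
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        return text[start:pos]

    return node()


def leaves(tree):
    """Return the set of leaf names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def bipartitions(tree):
    """Return the non-trivial splits of the tree, treated as unrooted."""
    all_taxa = leaves(tree)
    anchor = min(all_taxa)  # reference taxon used to make each split canonical
    splits = set()

    def walk(subtree):
        if isinstance(subtree, str):
            return
        side = leaves(subtree)
        other = all_taxa - side
        if len(side) > 1 and len(other) > 1:
            # Always store the side that does not contain the anchor taxon.
            splits.add(other if anchor in side else side)
        for child in subtree:
            walk(child)

    walk(tree)
    return splits


def rf_distance(newick_a, newick_b):
    """Number of splits found in exactly one of the two trees."""
    return len(bipartitions(parse_newick(newick_a)) ^ bipartitions(parse_newick(newick_b)))


print(rf_distance("((A,B),(C,D),E);", "((A,C),(B,D),E);"))  # 4
print(rf_distance("((A,B),(C,D),E);", "((A,B),((C,D),E));"))  # 0
```

The second call returns 0 because the two strings describe the same unrooted topology with different root placements.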

Tables

Table 1
The numbers of sampled localities per region.
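| Area | Localities | Area | Localities |
|---|---|---|---|
| Australian region | 298 | Sino-Jap. region | 79 |
| Australia | 118 | China | 51 |
| New Guinea & Solomons | 179 | Japan | 28 |
| New Zealand | 1 | | |
| Wallacea | 49 | | |
| Moluccas | 15 | Oriental region | 206 |
| Sulawesi | 34 | S. India & Ceylon | 3 |
| | | E. India & Burma | 12 |
| Afrotropical region | 64 | E. Indo-Burma | 44 |
| West Africa | 1 | Malay Peninsula | 57 |
| Guinean Gulf | 11 | Sumatra | 23 |
| Ethiopia | 6 | Java & Bali | 15 |
| East Africa | 10 | Philippines | 33 |
| South Africa | 25 | | |
| Madagascar | 11 | Total | 696 |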
Table 2
The numbers of described species and identified mOTUs (molecular operational taxonomic units) at 2% and 5% thresholds per region and the total number of OTUs identified for subtribes.

Based on morphological identification, the OTUs of the kassemiine and other deeply rooted clades are included in Metriorrhynchina.

| Region | Metriorrhynchina described / analyzed at 2% / 5% | Cautirina described / analyzed at 2% / 5% | Metanoeina described / analyzed at 2% / 5% | Metriorrhynchini described / analyzed at 2% / 5% | Ratio analyzed/described |
|---|---|---|---|---|---|
| Australian region | 639/1608/1239 | | | 639/1608/1239 | 2.52–1.93 |
| Australia | 196/167/133 | | | 196/167/131 | 0.85–0.67 |
| New Guinea | 423/1434/1105 | | | 423/1434/1105 | 3.39–2.61 |
| Solomon Isl. | 21/9/9 | | | 21/9/9 | 0.43 |
| Wallacea | 162/174/162 | 14/10/9 | | 176/184/171 | 1.05–0.97 |
| Philippines | 51/18/18 | 45/12/12 | 8/3/3 | 104/33/33 | 0.32 |
| Continental Asia | 43/52/42 | 331/330/257 | 30/34/31 | 404/416/330 | 1.03–0.82 |
| Sundaland | 36/44/39 | 201/184/146 | 24/19/17 | 261/247/202 | 0.95–0.77 |
| Indo-Burma | 6/7/7 | 62/52/42 | 3/4/4 | 74/63/53 | 0.85–0.72 |
| China, Japan | 1/1/1 | 53/75/58 | 1/11/11 | 55/87/70 | 1.58–1.27 |
| India | | 35/19/18 | 2/0/0 | 37/19/18 | 0.51–0.49 |
| Afrotropical region | | 231/104/94 | | 231/104/94 | 0.46–0.41 |
| Sub-Saharan Africa | | 178/74/65 | | 178/74/65 | 0.42–0.37 |
| Madagascar | | 53/30/29 | | 53/30/29 | 0.57 |
| Total number of OTUs | 895/1852/1445 | 641/456/369 | 38/37/34 | 1574/2345/1848 | 1.50–1.17 |
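
The ratio in the last column of Table 2 can be reproduced with simple arithmetic: the number of mOTUs at each threshold divided by the number of described species. The Python sketch below is a minimal illustration using the New Guinea row; the function name is hypothetical, and rounding to two decimals is an assumption about how the published values were obtained.

```python
def analyzed_to_described(described, motus_2pct, motus_5pct):
    """Return the mOTU/described-species ratios at the 2% and 5% thresholds."""
    if described <= 0:
        raise ValueError("described species count must be positive")
    return round(motus_2pct / described, 2), round(motus_5pct / described, 2)


# New Guinea row of Table 2: 423 described species, 1434 and 1105 mOTUs.
print(analyzed_to_described(423, 1434, 1105))  # (3.39, 2.61)
```

A ratio above 1 means that the sampled material contains more mOTUs than there are formally described species for the region.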
Appendix 1—table 1
Detailed information on regional sampling.
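| Region | Metriorrhynchina (# of specimens) | Cautirina (# of specimens) | Metanoeina (# of specimens) | Metriorrhynchini (# of specimens) |
|---|---|---|---|---|
| Australian region | 4,489 | | | 4,489 |
| Australian continent | 3,964 | | | 3,964 |
| Australia | 475 | | | 475 |
| New South Wales | 64 | | | 64 |
| Northern Territory | 18 | | | 18 |
| Queensland | 382 | | | 382 |
| South Australia | 1 | | | 1 |
| Western Australia | 10 | | | 10 |
| New Zealand | 1 | | | 1 |
| New Guinea | 3,461 | | | 3,461 |
| Solomon Islands | 19 | | | 19 |
| Aru Islands | 8 | | | 8 |
| Wallacea | 525 | 15 | | 540 |
| Maluku-Buru | 15 | | | 15 |
| Halmahera | 169 | | | 169 |
| Sulawesi | 341 | 15 | | |
| Oriental region | 278 | 1,311 | 113 | 1,692 |
| Philippines | 48 | 37 | 14 | 99 |
| Mindanao | 11 | 3 | 9 | 23 |
| Negros | 2 | | | |
| Palawan | 29 | 34 | 5 | 68 |
| Sibuyan | 4 | | | 4 |
| Luzon | 2 | | | 2 |
| Continental Asia | 230 | 1,264 | 99 | 1,593 |
| Malaya | 34 | 270 | 8 | 348 |
| Java | 21 | 45 | 6 | 72 |
| Bali | 2 | 1 | | 3 |
| Sumatra | 56 | 201 | 26 | 283 |
| Borneo | 82 | 253 | 22 | 357 |
| Cambodia | 7 | 13 | | 20 |
| Indo-Burma | 23 | 145 | 10 | 178 |
| China incl. Taiwan | 5 | 100 | 23 | 128 |
| India | | 59 | | 59 |
| Japan | | 177 | 4 | 181 |
| Afrotropical region | | 233 | | 233 |
| Africa | | 175 | | 175 |
| Madagascar | | 58 | | 58 |
| Total | 4,767 | 1,549 | 113 | 6,429 |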
Appendix 1—table 2
List of material for phylogenomic analyses.
Ingroup

| Voucher | Species | Clade | Geographic origin | Data source | Data type | Tissue type |
|---|---|---|---|---|---|---|
| R18010 | Metriorrhynchus s. l. | Metriorrhynchina, Porrostomine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18013 | Metriorrhynchus sp. | Metriorrhynchina, Porrostomine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R20001 | Metriorrhynchus philippinensis | Metriorrhynchina, Porrostomine | Philippines | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18018 | Metriorrhynchus s. l. | Metriorrhynchina, Porrostomine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18020 | Metriorrhynchus s. l. | Metriorrhynchina, Porrostomine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| 1_kite | Porrostoma rhipidium | Metriorrhynchina, Porrostomine | Australia | McKenna et al., 2019 | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18001 | Cladophorus sp. | Metriorrhynchina, Cladophorine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18009 | Cladophorus sp. | Metriorrhynchina, Cladophorine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18007 | Pseudodontocerus sp. | Metriorrhynchina, Cladophorine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18012 | Ditua s. l. | Metriorrhynchina, Cladophorine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18025 | Ditua s. l. | Metriorrhynchina, Cladophorine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18028 | Ditua s. l. | Metriorrhynchina, Cladophorine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18003 | Ditua s. l. | Metriorrhynchina, Cladophorine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18016 | Ditua s. l. | Metriorrhynchina, Cladophorine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18005 | Ditua s. l. | Metriorrhynchina, Cladophorine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18015 | Ditua s. l. | Metriorrhynchina, Cladophorine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18006 | Ditua s. l. | Metriorrhynchina, Cladophorine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18002 | Microtrichalus sp. | Metriorrhynchina, Trichaline | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18024 | Eniclases sp. | Metriorrhynchina, Trichaline | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18022 | Diatrichalus sp. | Metriorrhynchina, Trichaline | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18026 | Trichaline sp. | Metriorrhynchina, Trichaline | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| JB0085 | Trichaline sp. | Metriorrhynchina, Trichaline | New Guinea | This study | ILLUMINA WGS-seq PE-reads | Thoracic muscles |
| R18004 | Procautires sp. | Metriorrhynchina, Procautirine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18017 | Procautires sp. | Metriorrhynchina, Procautirine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18021 | Procautires sp. | Metriorrhynchina, Procautirine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18023 | Procautires sp. | Metriorrhynchina, Procautirine | New Guinea | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18037 | Broxylus sp. | Metriorrhynchina, Leptotrichaline | Sulawesi | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18030 | Leptotrichalus sp. | Metriorrhynchina, Leptotrichaline | Sulawesi | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18034 | Wakarumbia sp. | Metriorrhynchina, Leptotrichaline | Sulawesi | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18036 | Sulabanus sp. | Metriorrhynchina, Leptotrichaline | Sulawesi | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18041 | Sulabanus sp. | Metriorrhynchina, Leptotrichaline | Sulawesi | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18040 | Mangkutanus sp. | Metriorrhynchina, Leptotrichaline | Sulawesi | Kusy et al., 2019 | ILLUMINA RNA-seq PE-reads | Whole animal |
| R18039 | Xylobanus sp. | Cautirina | Sulawesi | This study | ILLUMINA RNA-seq PE-reads | Whole animal |
| AJ0013 | Cautires communis | Cautirina | Malaysia | Kusy et al., 2019 | ILLUMINA WGS-seq PE-reads | Thoracic muscles |
| G19002 | Metanoeus sp. | Metanoeina | Malaysia | This study | ILLUMINA WGS-seq PE-reads | Thoracic muscles |

Outgroup

| Species | Geographic origin | Data source | Data type | Tissue type |
|---|---|---|---|---|
| Dilophotes sp. | Malaysia | Kusy et al., 2019 | ILLUMINA RNA-seq PE-reads | Whole animal |
| Dihammatus sp. | Malaysia | Kusy et al., 2019 | ILLUMINA RNA-seq PE-reads | Whole animal |
| Lycoprogentes sp. | Malaysia | Kusy et al., 2019 | ILLUMINA RNA-seq PE-reads | Whole animal |
| Libnetis sp. | Malaysia | Kusy et al., 2019 | ILLUMINA RNA-seq PE-reads | Whole animal |
| Platerodrilus sp. | Malaysia | Kusy et al., 2019 | ILLUMINA RNA-seq PE-reads | Whole animal |
| Lyropaeus optabilis | Malaysia | Kusy et al., 2019 | ILLUMINA RNA-seq PE-reads | Whole animal |
| Antennolycus constrictus | Malaysia | Kusy et al., 2019 | ILLUMINA RNA-seq PE-reads | Whole animal |
Appendix 1—table 3
Overview of official gene sets of six reference species used for ortholog assessment.

The numbers of genes correspond to OrthoDB 9.1.

| Species | Accession | # of contigs | Source | Download date | Reference |
|---|---|---|---|---|---|
| Onthophagus taurus | PRJNA167478 | 17,483 | i5K | 05.03.2017 | 1 |
| Tribolium castaneum | PRJNA12540 | 16,631 | iBeetle | 05.03.2017 | 2, 3 |
| Dendroctonus ponderosae | PRJNA360270 | 13,088 | ENS Metazoa | 05.03.2017 | 4 |
| Anoplophora glabripennis | PRJNA167479 | 22,035 | i5K | 05.03.2017 | 5 |
| Leptinotarsa decemlineata | PRJNA171749 | 24,671 | i5K | 05.03.2017 | 1 |
| Agrilus planipennis | PRJNA230921 | 15,497 | i5K | 05.03.2017 | 1 |
Appendix 1—table 4
Detailed information and statistics of each generated dataset.
| Dataset name | Number of taxa | Number of partitions | Number of alignment sites | Completeness score (Ca), AliStat | Percentage of pairwise p-values < 0.05 for the Bowker’s test | Matrix saturation (SV), MARE | Information content (IC), MARE | Partition scheme optimization |
|---|---|---|---|---|---|---|---|---|
| Amino acids | | | | | | | | |
| A-4109-AA | 42 | 4,109 | 1,892,691 | 0.772127 | 98.61% | 0.888 | 0.506 | Yes |
| D-3370-AA_Bacoca | 42 | 3,370 | 1,672,362 | 0.780363 | 97.91% | 0.880 | 0.511 | |
| F-1490-AA_Bacoca_decisive | 42 | 1,490 | 673,102 | 0.925082 | 94.08% | 1 | 0.594 | |
| J-2129-AA_MARE | 42 | 2,129 | 959,741 | 0.876331 | 90.82% | 0.964 | 0.648 | |
| Nucleotides | | | | | | | | |
| B-4109-NT | 42 | 4,109 | 5,678,073 | 0.772133 | 100% | NA | NA | |
| C-4109-NT12 | 42 | 4,109 | 3,785,382 | 0.772133 | 99.77% | NA | NA | |
| E-4109-NT2 | 42 | 4,109 | 1,892,691 | 0.772134 | 92.45% | NA | NA | |
| G-NT-1767_MaxSymTest | 42 | 1,767 | 2,413,164 | 0.713912 | 99.88% | NA | NA | |
| H-NT-1645_MaxSymTestmarginal | 42 | 1,645 | 2,233,485 | 0.712208 | 99.77% | NA | NA | |
| I-NT-3905_MaxSymTestInternal | 42 | 3,905 | 5,377,449 | 0.771645 | 100% | NA | NA | |
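
To give a sense of what an alignment completeness score measures, the Python sketch below computes the proportion of completely specified characters in a small alignment. It assumes that the score is the number of unambiguous characters divided by the product of the numbers of sequences and sites; that reading, the function name, and the toy alignment are assumptions made for illustration, and the code is not the AliStat implementation.

```python
def completeness_score(alignment, alphabet="ACGT"):
    """Share of alignment cells holding a completely specified character."""
    rows = [seq.upper() for seq in alignment.values()]
    if not rows:
        raise ValueError("alignment is empty")
    n_sites = len(rows[0])
    if n_sites == 0 or any(len(row) != n_sites for row in rows):
        raise ValueError("all sequences must have the same non-zero length")
    # Gaps, N and other ambiguity codes are not in the alphabet and count as missing.
    specified = sum(ch in alphabet for row in rows for ch in row)
    return specified / (len(rows) * n_sites)


# Toy alignment (hypothetical): 9 of 18 cells are unambiguous nucleotides.
toy_alignment = {"t1": "ACGT--", "t2": "ACNTAC", "t3": "------"}
print(completeness_score(toy_alignment))  # 0.5
```

For amino acid supermatrices the `alphabet` argument would be replaced by the 20 standard residue codes.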
Appendix 1—table 5
Characteristics of concatenated supermatrices (mitochondrial fragments) and the models of DNA evolution used (column abbreviations are listed below).

Abbreviations. Seqs – number of sequences, Sites – number of bases, Infor – number of parsimony-informative sites, Invar – number of invariant sites, Model – best-fit model according to BIC using ModelFinder.

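| Subset | Seqs | Sites | Infor | Invar | Model |
|---|---|---|---|---|---|
| Complete dataset | | | | | |
| nad5 | 3,205 | 987 | 809 | 88 | GTR + F + I + G4 |
| cox | 5,935 | 1,068 | 812 | 134 | GTR + F + I + G4 |
| 16S | 2,381 | 875 | 577 | 201 | GTR + F + I + G4 |
| 98% OTUs delimitation dataset | | | | | |
| nad5 | 1,252 | 969 | 776 | 115 | GTR + F + I + G4 |
| tRNA Leu | 2,245 | 62 | 34 | 17 | GTR + F + G4 |
| cox1 | 2,395 | 783 | 536 | 141 | GTR + F + I + G4 |
| cox2 | 2,192 | 216 | 194 | 11 | GTR + F + I + G4 |
| 16S | 1,115 | 834 | 512 | 237 | GTR + F + I + G4 |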

Additional files

Transparent reporting form
https://cdn.elifesciences.org/articles/71895/elife-71895-transrepform1-v2.docx
Supplementary file 1

Maximum likelihood tree recovered by the analysis of the full dataset (mitochondrial fragments); numbers above branches represent SH-aLRT support values.

https://cdn.elifesciences.org/articles/71895/elife-71895-supp1-v2.docx
Source data 1

Maximum likelihood tree recovered by the analysis of the full dataset (mitochondrial fragments).

Depicted numbers above branches represent SH-aLRT support values.

https://cdn.elifesciences.org/articles/71895/elife-71895-supp2-v2.pdf
Source data 2

Maximum likelihood tree recovered by the analysis of the reduced dataset (98% similarity OTUs).

Depicted numbers above branches represent SH-aLRT support values. The taxa with a constrained position in the tree are marked with a red star.

https://cdn.elifesciences.org/articles/71895/elife-71895-supp3-v2.pdf
Source data 3

Maximum likelihood tree recovered by the analysis of the reduced dataset (mitochondrial fragments, 98% similarity OTUs) with unconstrained backbone.

Depicted numbers above branches represent SH-aLRT support values.

https://cdn.elifesciences.org/articles/71895/elife-71895-supp4-v2.pdf

Cite this article

  1. Michal Motyka
  2. Dominik Kusy
  3. Matej Bocek
  4. Renata Bilkova
  5. Ladislav Bocak
(2021)
Phylogenomic and mitogenomic data can accelerate inventorying of tropical beetles during the current biodiversity crisis
eLife 10:e71895.
https://doi.org/10.7554/eLife.71895