Myoglobin primary structure reveals multiple convergent transitions to semi-aquatic life in the world's smallest mammalian divers

  1. Kai He (corresponding author)
  2. Triston G Eastman
  3. Hannah Czolacz
  4. Shuhao Li
  5. Akio Shinohara
  6. Shin-ichiro Kawada
  7. Mark S Springer
  8. Michael Berenbrink (corresponding author)
  9. Kevin L Campbell (corresponding author)
  1. Department of Biological Sciences, University of Manitoba, Canada
  2. Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, China
  3. State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, China
  4. Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, China
  5. Department of Evolution, Ecology and Behaviour, University of Liverpool, United Kingdom
  6. Department of Bio-resources, Division of Biotechnology, Frontier Science Research Center, University of Miyazaki, Japan
  7. Department of Zoology, Division of Vertebrates, National Museum of Nature and Science, Japan
  8. Department of Evolution, Ecology and Organismal Biology, University of California, Riverside, United States
5 figures, 1 table and 2 additional files

Figures

Figure 1
Museum specimen photos illustrating the four major ecomorphotypes within the order Eulipotyphla.

Representative terrestrial (shrew-like mole, Uropsilus soricipes; bottom right), semi-aquatic (Russian desman, Desmana moschata; right centre), strictly fossorial (Eastern mole, Scalopus aquaticus; left), and semi-fossorial (Chinese long-tailed mole, Scaptonyx fusicaudus; top centre) talpid mole species are given. Photo by Kai He.

Figure 2 with 4 supplements
Time calibrated Bayesian phylogenetic tree of Eulipotyphla based on a concatenated alignment of 23 nuclear genes (outgroups not shown).

The units of time are in millions of years (Ma). Branch lengths represent median ages. Node bars indicate the 95% confidence interval [CI] for each clade age. Unless specified, all relationships are highly supported. Relationships weakly supported in concatenation Bayesian and maximum likelihood (PP <0.97 and/or BS <80: #) as well as *BEAST and ASTRAL coalescent analyses (C-PP <0.97 and/or C-BS <80: *) are indicated. Note that an alternative position was recovered for Soricini in the ASTRAL tree (Node A; Figure 2—figure supplement 3), while both coalescent analyses (Node B; ASTRAL, *BEAST) favored Episoriculus monophyly (Figure 2—figure supplements 2–4). Colored bars at the tips of the tree denote terrestrial (green), semi-aquatic (blue), semi-fossorial (beige), and fossorial (brown) lifestyles in extant species.

Figure 2—figure supplement 1
A heatmap produced using ggtree, showing tree-of-life genes included for each of the 76 samples used for phylogenetic analyses.

The ultrametric tree is the BEAST concatenation gene tree and a blank block indicates the gene is missing in the final dataset.

Figure 2—figure supplement 2
The full RAxML species tree constructed from 71 eulipotyphlan specimens.

Bootstrap supports are given next to internal nodes.

Figure 2—figure supplement 3
The full ASTRAL-III coalescent species tree constructed from 71 eulipotyphlan specimens.

Bootstrap supports are given next to internal nodes.

Figure 2—figure supplement 4
The full *BEAST coalescent species tree constructed from 71 eulipotyphlan specimens.

Posterior probabilities are given next to internal nodes.

Figure 3 with 2 supplements
Three-dimensional structural models of myoglobin in (A) the last common ancestor of Eulipotyphla and (B) the semi-aquatic Russian desman (Desmana moschata) obtained by homology modelling using the SWISS-MODEL server (Waterhouse et al., 2018).

The structure of the last common ancestor of the group was modelled based on results of an amino acid sequence reconstruction (see text for details). Ancestral (left) and derived (right) states of charge-changing amino acid replacements are circled and indicated with positional number and one-letter amino acid code. Blue and red colors indicate amino acids with positively (H, His; K, Lys; R, Arg) and negatively charged amino acid side chains (D, Asp; E, Glu), respectively. White double arrows indicate surface amino acid side chains involved in salt bridges that are affected by charge-changing substitutions. Text boxes indicate the reconstructed temporal order (top to bottom) of charge decreasing and charge increasing amino acid substitutions (red and blue font, respectively) in the Desmana lineage in one-letter code from ancestral (left) to derived (right) separated by positional number. Note that charge neutral substitutions (e.g. G35N) are not given in the text boxes.
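The substitution notation used in this caption (ancestral residue, positional number, derived residue, e.g. G35N) and the split into charge increasing and charge decreasing replacements can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the `first_position` parameter, and the nominal per-residue charges (in particular the value of +0.5 assigned to the weakly cationic His) are assumptions made only for illustration.

```python
# Nominal side-chain charges, ASSUMED for illustration only.
# K/R are treated as fully cationic, D/E as fully anionic, and H as
# weakly cationic (the 0.5 is a hypothetical placeholder value).
NOMINAL_CHARGE = {"K": 1.0, "R": 1.0, "H": 0.5, "D": -1.0, "E": -1.0}


def charge_changing_substitutions(ancestral, derived, first_position=1):
    """List charge-changing substitutions between two aligned sequences.

    Returns (label, direction) tuples, where label follows the
    ancestral-position-derived convention (e.g. "K2Q") and direction is
    "increasing" or "decreasing". Charge-neutral replacements and
    alignment gaps ("-") are skipped.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")

    changes = []
    for offset, (old, new) in enumerate(zip(ancestral, derived)):
        if old == new or "-" in (old, new):
            continue
        delta = NOMINAL_CHARGE.get(new, 0.0) - NOMINAL_CHARGE.get(old, 0.0)
        if delta != 0:
            label = f"{old}{first_position + offset}{new}"
            direction = "increasing" if delta > 0 else "decreasing"
            changes.append((label, direction))
    return changes


if __name__ == "__main__":
    # Toy three-residue alignment, not real myoglobin data.
    print(charge_changing_substitutions("GKD", "NQD"))
```

In the toy call, the G to N replacement is charge neutral and is therefore omitted, mirroring how neutral substitutions are left out of the text boxes.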

Figure 3—figure supplement 1
Structural model of myoglobin in (A) the last common ancestor of Eulipotyphla and (B) the semi-aquatic Russian desman (Desmana moschata).

Structures were obtained by homology modelling using the SWISS-MODEL server (Waterhouse et al., 2018) and the primary structures of myoglobin obtained by conceptual translation of the nucleotide sequence determined here (Russian desman) or by ancestral amino acid sequence reconstruction (see text for details). Structures were visualised in PyMOL version 2.1.1. Note that the gap at position 121 in the GH-loop of the tertiary structure (circled in red) of the Russian desman in (B) appears to exert negligible effect on the tertiary structure of the protein.

Figure 3—figure supplement 2
Location of charge-changing amino acid substitutions in the oxygen-storing protein myoglobin of four semi-aquatic species of moles and shrews in the mammalian insectivore order Eulipotyphla.

Three-dimensional structures were obtained by homology modelling using the SWISS-MODEL server (Waterhouse et al., 2018) and the amino acid sequences of (A) Sorex palustris, (B) Neomys fodiens, (C) Nectogale elegans, and (D) Condylura cristata. The three-dimensional myoglobin structure of the last common ancestor of the group was also modelled based on results of an amino acid sequence reconstruction (see text for details). Ancestral (left) and derived (right) states of charge-changing amino acid replacements are circled and indicated with positional number and one-letter amino acid code. Blue and red color indicate amino acids with positively (H, His; K, Lys; R, Arg) and negatively charged amino acid side chains (D, Asp; E, Glu). White double arrows indicate salt bridges that are affected by charge-changing substitutions. Image views between panels A-D have been rotated to maximally visualise lineage-specific replacements.

Figure 4 with 3 supplements
Relationship of modelled myoglobin net surface charge of eulipotyphlan mammals to lifestyle and relative electrophoretic mobility of native myoglobin proteins.

(A) Violin plot showing the distribution (y-axis) and probability density (x-axis) of modelled myoglobin net surface charge, ZMb, among living species (black dots) of the four prevalent eulipotyphlan ecomorphotypes. (B) Correlation between ZMb and electrophoretic mobility of native myoglobin from five eulipotyphlan insectivores; data from the grey seal (Halichoerus grypus) are added for comparison. ZMb was calculated as the sum of the charge of all ionisable groups at pH 6.5 by modelling myoglobin primary structures onto the tertiary structure and using published, conserved, site-specific ionisation constants (McLellan, 1984; Mirceta et al., 2013). Electrophoretic mobility was assessed relative to the mobility of grey seal myoglobin using native polyacrylamide gel electrophoresis of heart or skeletal muscle protein extracts of the indicated species. Green, orange, brown, and blue areas (A) or symbols and fonts (B) indicate terrestrial, semi-fossorial, fossorial, and semi-aquatic/aquatic species, respectively. Phylogenetic Generalised Least Squares analysis in panel (B) revealed a highly significant positive correlation (R² = 0.897, p < 0.005) between the two parameters (solid line, y = 0.1488x + 0.3075).
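The idea of summing the charge of all ionisable groups at a fixed pH can be sketched with the Henderson–Hasselbalch relation. The Python example below is a simplified illustration only: it uses generic textbook side-chain pKa values (an assumption) rather than the site-specific ionisation constants of McLellan (1984) and Mirceta et al. (2013), and it ignores chain termini, the heme group, and buried residues. It will therefore not reproduce the published ZMb values.

```python
# Generic side-chain pKa values: an ASSUMPTION for illustration only.
# The published ZMb calculation uses site-specific constants instead.
GENERIC_PKA = {"D": 3.9, "E": 4.1, "H": 6.0, "K": 10.5, "R": 12.5}
ACIDIC = {"D", "E"}


def group_charge(pka, ph, acidic):
    """Fractional charge of one ionisable group (Henderson-Hasselbalch)."""
    if acidic:
        # Acidic groups carry -1 when deprotonated (pH well above pKa).
        return -1.0 / (1.0 + 10 ** (pka - ph))
    # Basic groups carry +1 when protonated (pH well below pKa).
    return 1.0 / (1.0 + 10 ** (ph - pka))


def net_charge(sequence, ph=6.5, pka_table=GENERIC_PKA):
    """Sum the fractional side-chain charges over a one-letter sequence."""
    total = 0.0
    for residue in sequence.upper():
        if residue in pka_table:
            total += group_charge(pka_table[residue], ph, residue in ACIDIC)
    return total


if __name__ == "__main__":
    # Toy peptide, not a real myoglobin sequence.
    print(round(net_charge("GKKHEDG"), 3))
```

Because His has a pKa close to the modelling pH of 6.5, its contribution is fractional, which is consistent with His being described as only weakly cationic in the alignment legend.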

Figure 4—figure supplement 1
Time-calibrated tree of 55 eulipotyphlan species for which complete myoglobin coding sequences were determined (left).

Horizontal bars on the right indicate the calculated ZMb for each species, which are color coded according to species lifestyle.

Figure 4—figure supplement 2
Comparisons of the Bayesian concatenation species tree estimated using the tree-of-life genes with myoglobin RAxML gene trees estimated using nucleotide sequences.

Bootstrap supports are given next to internal nodes on the myoglobin gene trees. Only Bootstrap supports higher than 70 are shown.

Figure 4—figure supplement 3
Comparisons of the Bayesian concatenation species tree estimated using the tree-of-life genes with myoglobin RAxML gene trees estimated using amino-acid sequences.

Bootstrap supports are given next to internal nodes on the myoglobin gene trees. Only Bootstrap supports higher than 70 are shown.

Figure 5 with 3 supplements
Evolutionary reconstruction of myoglobin net surface charge ZMb in 55 eulipotyphlan insectivores mapped onto the time calibrated phylogeny of Figure 2.

Ancestral ZMb was modelled from primary structures as in Figure 3 and after maximum likelihood ancestral sequence reconstruction. Major charge increasing (blue font) and charge decreasing (red font) amino acid substitutions, from ancestral to derived and separated by positional number, inferred for the immediate ancestry of semi-aquatic species (blue font) are indicated in textboxes alongside the respective branches. Grey and white background shading indicates geologic epochs. See Figure 5—figure supplement 1A for a complete account of charge-changing substitutions, reconstructed ZMb values, and outgroup information. Paintings of representative species by Umi Matsushita.

Figure 5—figure supplement 1
Evolutionary reconstructions of myoglobin net surface charge (ZMb) in eulipotyphlan mammals.

Maximum likelihood ancestral (A) amino-acid-based and (B) codon-based sequence reconstruction and net surface charge (ZMb) calculation of eulipotyphlan myoglobin mapped onto the time calibrated phylogeny of Figure 2. Results of a separate amino-acid-based reconstruction based on the *BEAST species tree (Figure 2—figure supplement 4) are provided in (C), with Episoriculus fumidus, the species with an alternative position compared to A and B, in red font. Only charge-changing substitutions are shown, with blue and red font in text boxes indicating charge increasing and charge decreasing substitutions, respectively. Substitutions with absolute charge changes >1.0 are underlined, whereas those <0.10 are shown in brackets. Yellow highlighting indicates amino acids reconstructed with p<0.95 for which less likely but differentially charged amino acids have been reconstructed with p>0.05 (see text for details). Reconstructed net charges are shown at nodes but, for clarity, have been omitted in some cases where they were identical to the value at the preceding node. Terminal species considered semi-aquatic are indicated by blue font, with ZMb values > +2.0 also given in blue font.

Figure 5—figure supplement 2
Ancestral reconstruction of semi-aquatic lifestyles (blue filled circles) within Eulipotyphla based on both the (A) RAxML concatenation gene tree and (B) the *BEAST species tree.

For each tree, reconstructions were constructed using either a maximum parsimony (left) or a threshold (right) model. Note that Episoriculus fumidus (red font in B) was supported at different positions on the two trees.

Figure 5—figure supplement 3
Threshold analyses of myoglobin net surface charge and lifestyle of 55 eulipotyphlan mammals.

Posterior density distribution of the correlation coefficient (r) between (A) semi-aquatic, (B) fully fossorial, and (C) digging (to include fossorial and semi-fossorial lifestyles) and ZMb as estimated using the threshBayes function. (D-G) show the results based on the analyses of subsets of samples that included only: (D) terrestrial and semi-aquatic species, (E) terrestrial and fully fossorial species, (F) terrestrial and digging species, or (G) semi-aquatic and fully fossorial species. The solid red line indicates the grand mean, the green dashed lines indicate the mean of the 80% confidence intervals, and the grey dashed lines indicate the mean of the 95% confidence intervals.

Tables

Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Commercial assay or kit | Qiagen DNeasy Blood and Tissue Kit | Qiagen | 69504 |
Peptide, recombinant protein | NEBNext dsDNA Fragmentase | New England BioLabs | M0348 |
Commercial assay or kit | NEBNext Fast DNA Library Prep Set for Ion Torrent kit | New England BioLabs | E6270L |
Sequence-based reagent | NEXTflex DNA Barcodes for Ion Torrent | BIOO Scientific | NOVA-401004 |
Commercial assay or kit | E-Gel EX Gel, 2% | Invitrogen | G402002 |
Peptide, recombinant protein | NEBNext High-Fidelity 2X PCR Master Mix | New England BioLabs | M0541L |
Chemical compound, drug | Sera-mag speedbeads | ThermoFisher | 09-981-123 |
Commercial assay or kit | myBaits custom target capture kit | Arbor Biosciences | personalized |
Commercial assay or kit | Ion 318 Chip Kit v2 BC | ThermoFisher | 4488146 |
Software, algorithm | Torrent Suite | ThermoFisher https://github.com/iontorrent/TS copy archived at swh:1:rev:7591590843c967435ee093a3ffe9a2c6dea45ed8; Bridenbecker et al., 2020 | | v4.0.2
Software, algorithm | AlienTrimmer | https://research.pasteur.fr/en/software/alientrimmer/ | RRID:SCR_011835 | v0.3.2
Software, algorithm | SolexaQA++ | http://solexaqa.sourceforge.net/ | RRID:SCR_005421 | v3.1
Software, algorithm | ParDRe | https://sourceforge.net/projects/pardre/ | | v2.25
Software, algorithm | Karect | https://github.com/aminallam/karect, copy archived at swh:1:rev:ba3ad54e5f8ccec5fa972333fcf441ac0c6c2be0; Allam, 2015 | |
Software, algorithm | Abyss | http://www.bcgsc.ca/platform/bioinfo/software/abyss | RRID:SCR_010709 | v2.0
Software, algorithm | MIRA | http://sourceforge.net/p/mira-assembler/wiki/Home/ | RRID:SCR_010731 | v4.0
Software, algorithm | SPAdes | http://bioinf.spbau.ru/spades/ | RRID:SCR_000131 | v3.10
Software, algorithm | Geneious | http://www.geneious.com/ | RRID:SCR_010519 | R11
Software, algorithm | FastQC | http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ | RRID:SCR_014583 | v0.11.5
Software, algorithm | Trimmomatic | http://www.usadellab.org/cms/index.php?page=trimmomatic | RRID:SCR_011848 | v0.39
Software, algorithm | PHYLUCE | https://github.com/faircloth-lab/phyluce, copy archived at swh:1:rev:66ff432f95cb8430d23f6c66a7981d57e8e06902; Faircloth et al., 2021 | | v1.6.0
Software, algorithm | MAFFT | http://mafft.cbrc.jp/alignment/server/ | RRID:SCR_011811 | v6.864
Software, algorithm | FastTree | http://www.microbesonline.org/fasttree/ | RRID:SCR_015501 | v2.1.5
Software, algorithm | ASTRAL III | https://github.com/smirarab/ASTRAL copy archived at swh:1:rev:05a85064da2ace5236dba94907bb3c45f45f9597; Mirarab et al., 2021 | | v5.15.0
Software, algorithm | RDP | http://web.cbio.uct.ac.za/~darren/rdp.html | RRID:SCR_018537 | v5.5
Software, algorithm | RAxML | https://github.com/stamatak/standard-RAxML, copy archived at swh:1:rev:a33ff40640b4a76abd5ea3a9e2f57b7dd8d854f6; Stamatakis et al., 2018 | RRID:SCR_006086 | v8.2
Software, algorithm | Newick utilities | http://cegg.unige.ch/newick_utils | | v1.6
Software, algorithm | Tree Graph 2 | http://treegraph.bioinfweb.info/ | | v2
Software, algorithm | BEAST | BEAST 2 https://www.beast2.org | RRID:SCR_010228 | v2.5
Software, algorithm | MEGA-X | http://megasoftware.net/ | RRID:SCR_000667 | Version X
Software, algorithm | PAML | http://abacus.gene.ucl.ac.uk/software/paml.html | RRID:SCR_014932 | v4.8
Software, algorithm | EasyCodeML | https://github.com/BioEasy/EasyCodeML, copy archived at swh:1:rev:744a2480e2071c85e044155d8699e87b46356eb9; Chen, 2021 | | v1.31
Software, algorithm | FastML | https://swissmodel.expasy.org/ | RRID:SCR_000305 | v3.11
Software, algorithm | PyMol | Schrödinger, LLC (http://www.pymol.org) | RRID:SCR_000305 | v2.1.1
Software, algorithm | SWISS-MODEL server | https://swissmodel.expasy.org/ | RRID:SCR_018123 |
Software, algorithm | R | https://www.r-project.org/ | | v3.6
Software, algorithm | CAPER | https://cran.r-project.org/web/packages/caper/index.html | | v1.0.1
Software, algorithm | phytools | https://cran.r-project.org/web/packages/phytools/index.html | RRID:SCR_015502 | v0.7
Software, algorithm | castor | https://cran.r-project.org/web/packages/castor/index.html | | v1.6.7
Software, algorithm | ggtree | https://bioconductor.org/packages/ggtree/ | RRID:SCR_018560 | v3.12

Additional files

Supplementary file 1

Supplementary information for myoglobin primary structure reveals multiple convergent transitions to semi-aquatic life in the world's smallest mammalian divers.

(a) Sample information of specimens used in this study. (b) Hybridisation capture results of tree-of-life gene segments from 61 eulipotyphlan DNA libraries. Numbers in each column represent total base pairs captured; NA: no data. (c) Results of the likelihood-based Shimodaira–Hasegawa test comparing the best scoring RAxML concatenated gene tree with alternative evolutionary hypotheses. (d) Myoglobin amino acid alignment used for modeling myoglobin net surface charge (ZMb) and ancestral sequence reconstructions. Myoglobin helices A to H are highlighted in yellow, with amino acid positions and helical notations indicated above and below the graphic, respectively. Internal amino acid residue positions are shaded in light grey, while deleted residues are indicated by a dash mark. Strongly anionic residues (D [Asp] and E [Glu]) are shaded in red, with strongly (K [Lys] and R [Arg]) and weakly (H [His]) cationic residues shaded in dark and light green, respectively. (e) Charge increasing (blue font) and decreasing (red font) residue substitutions reconstructed for semi-aquatic eulipotyphlan branches. (f) Evolutionary models estimated using bModelTest in BEAST, and used for BEAST and *BEAST analyses. (g) RAxML best scoring gene trees used for ASTRAL-III coalescent analysis before and after collapsing 0% Shimodaira–Hasegawa (SH) scores in Newick format. (h) Calibrations used for estimating divergence times in the BEAST analyses. (i) The best scoring concatenation species trees estimated using BEAST and RAxML, and the best species coalescence trees estimated using ASTRAL-III and *BEAST, in Newick format. (j) Primers used to amplify and sequence the protein coding exons of myoglobin.

https://cdn.elifesciences.org/articles/66797/elife-66797-supp1-v2.xlsx
Transparent reporting form
https://cdn.elifesciences.org/articles/66797/elife-66797-transrepform-v2.pdf

  1. Kai He
  2. Triston G Eastman
  3. Hannah Czolacz
  4. Shuhao Li
  5. Akio Shinohara
  6. Shin-ichiro Kawada
  7. Mark S Springer
  8. Michael Berenbrink
  9. Kevin L Campbell
(2021)
Myoglobin primary structure reveals multiple convergent transitions to semi-aquatic life in the world's smallest mammalian divers
eLife 10:e66797.
https://doi.org/10.7554/eLife.66797