Somatic mutation rates scale with time not growth rate in long-lived tropical trees

  1. Akiko Satake (corresponding author)
  2. Ryosuke Imai
  3. Takeshi Fujino
  4. Sou Tomimoto
  5. Kayoko Ohta
  6. Mohammad Na'iem
  7. Sapto Indrioko
  8. Widiyatno Widiyatno
  9. Susilo Purnomo
  10. Almudena Molla Morales
  11. Viktoria Nizhynska
  12. Naoki Tani
  13. Yoshihisa Suyama
  14. Eriko Sasaki
  15. Masahiro Kasahara
  1. Department of Biology, Faculty of Science, Kyushu University, Japan
  2. Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Japan
  3. Faculty of Forestry, Universitas Gadjah Mada, Indonesia
  4. PT. Sari Bumi Kusuma, Indonesia
  5. Gregor Mendel Institute of Molecular Plant Biology, Austrian Academy of Sciences, Austria
  6. Forestry Division, Japan International Research Center for Agricultural Sciences, Japan
  7. Faculty of Life and Environmental Sciences, University of Tsukuba, Japan
  8. Field Science Center, Graduate School of Agricultural Science, Tohoku University, Japan
4 figures, 1 table and 2 additional files

Figures

Figure 1 with 3 supplements
Physical tree structures and phylogenetic trees constructed from somatic mutations.

(a) Comparisons of physical tree structures (left, branch length in meters) and neighbor-joining (NJ) trees (right, branch length in the number of nucleotide substitutions) in two tropical tree species: S. laevis, a slow-growing species (S1 and S2), and S. leprosula, a fast-growing species (F1 and F2). IDs are assigned to each sample from which genome sequencing data were generated. Vertical lines represent tree heights. (b) Distribution of somatic mutations within tree architecture. Gray and white cells indicate the presence (gray) and absence (white) of a somatic mutation in each of the eight samples, compared to the genotype of sample 0. Sample IDs are the same between panels (a) and (b). The distribution pattern of somatic mutations is categorized as Single, Double, or More depending on the number of samples possessing the focal somatic mutation. Among the 2^7 − 1 possible distribution patterns, those observed in at least one of the four individuals are shown.
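
The count of possible distribution patterns in panel (b) can be checked with a short enumeration. This is a minimal sketch, assuming seven samples are scored for presence or absence against sample 0 and that a pattern with no carrier is not a mutation; the function and variable names are illustrative and do not come from the article.

```python
from itertools import product

N_SAMPLES = 7  # assumed: samples compared against the genotype of sample 0


def categorize(pattern):
    """Label a presence/absence pattern by the number of samples carrying the mutation."""
    carriers = sum(pattern)
    if carriers == 1:
        return "Single"
    if carriers == 2:
        return "Double"
    return "More"


# Every non-empty presence (1) / absence (0) pattern across the seven samples.
patterns = [p for p in product((0, 1), repeat=N_SAMPLES) if any(p)]

# Tally how many patterns fall into each category.
category_counts = {}
for p in patterns:
    label = categorize(p)
    category_counts[label] = category_counts.get(label, 0) + 1
```

Only a subset of these patterns is expected to occur in a real tree, because mutations are inherited along the branching structure.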

Figure 1—figure supplement 1
Target tropical trees and location of study site.

(a) Images of S. laevis (S1), a slow-growing species, and S. leprosula (F1), a fast-growing species. (b) Location of the study site in central Borneo, Indonesia.

Figure 1—figure supplement 2
Workflow for identifying de novo somatic SNVs.

Eight samples (seven leaves and one cambium) were collected from each of four trees (two trees per species). DNA was extracted twice independently from each sample and sequenced independently. Reads were mapped to the reference genome and used for SNV calling and filtering. SNVs across the eight samples were called using GATK HaplotypeCaller (GATK) and Bcftools mpileup (BCF tools) independently for each set of biological replicates, generating potential SNVs for each set of replicates and for each SNP caller (G1 and G2 for GATK, B1 and B2 for BCF tools). For BCF tools, we set three thresholds (T40, T30, and T20) with different base quality (BQ) and mapping quality (MQ). For each SNP caller, SNVs detected in both replicates were extracted, yielding SNV_GATK for GATK and SNV_BCF for BCF tools at each of the three thresholds. These SNVs were then filtered by extracting those detected by both SNP callers, generating potential SNVs for each threshold: SNV_T40, SNV_T30, and SNV_T20. Finally, SNVs detected at any of the three thresholds were extracted to obtain candidate SNVs. We checked the candidate SNVs manually and obtained a final set of SNVs, SNV_Final.
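
The set logic of this workflow (intersect replicates, intersect callers, then take the union over thresholds) can be sketched as follows. This is an illustration only, not the authors' pipeline: it assumes each call set has already been parsed into a Python set of hypothetical (contig, position, alternate base) tuples, and it does not cover the manual check that produces the final set.

```python
def candidate_snvs(gatk_replicates, bcf_replicates_by_threshold):
    """Combine replicate and caller call sets into candidate SNVs.

    gatk_replicates: pair of sets (G1, G2).
    bcf_replicates_by_threshold: dict mapping a threshold name such as "T40"
        to a pair of sets (B1, B2) called at that threshold.
    Returns (per_threshold, candidates).
    """
    g1, g2 = gatk_replicates
    snv_gatk = g1 & g2  # detected in both GATK replicates

    per_threshold = {}
    for threshold, (b1, b2) in bcf_replicates_by_threshold.items():
        snv_bcf = b1 & b2  # detected in both Bcftools replicates
        per_threshold[threshold] = snv_gatk & snv_bcf  # detected by both callers

    # Candidate SNVs: detected at any of the thresholds.
    candidates = set().union(*per_threshold.values())
    return per_threshold, candidates


# Toy call sets; the coordinates are made up for illustration.
a = ("contig1", 100, "T")
b = ("contig1", 250, "A")
c = ("contig2", 42, "G")

per_threshold, candidates = candidate_snvs(
    ({a, b, c}, {a, b}),
    {
        "T40": ({a}, {a, c}),
        "T30": ({a, b}, {a, b}),
        "T20": ({a, b, c}, {a, b, c}),
    },
)
```

In this toy input, variant c is dropped because it is missing from the second GATK replicate, even though Bcftools reports it at the most permissive threshold.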

Figure 1—figure supplement 3
Synteny relationship between S. laevis and S. leprosula.

The collinear blocks within the genomes of S. leprosula and S. laevis are displayed as gray lines, with orange objects representing the contigs of the S. leprosula genome and green objects denoting the contigs of the S. laevis genome. Where the direction of a contig in S. laevis was partly different from that in S. leprosula, the S. laevis contig is colored red; otherwise it is colored green.

Figure 2
The relationship between physical distance and the number of SNVs.

(a) Linear regression of the number of SNVs against the pair-wise distance between branch tips, with an intercept of 0, for each tree (S1: blue, S2: light blue, F1: red, and F2: orange). Shaded areas represent 95% confidence intervals of the regression lines. Regression coefficients are listed in Supplementary file 1h. (b) Comparison of somatic mutation rates per nucleotide per growth and per year across four tropical trees. Bars indicate 95% confidence intervals.
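
A regression constrained to pass through the origin has a closed-form slope, b = Σxy / Σx². The sketch below shows only that least-squares estimate, using made-up numbers; it does not reproduce the confidence intervals in the figure, and the function name is hypothetical.

```python
def slope_through_origin(distances_m, snv_counts):
    """Least-squares slope of y = b * x (intercept fixed at 0)."""
    if len(distances_m) != len(snv_counts):
        raise ValueError("distances and counts must have the same length")
    sum_xy = sum(x * y for x, y in zip(distances_m, snv_counts))
    sum_xx = sum(x * x for x in distances_m)
    if sum_xx == 0:
        raise ValueError("at least one non-zero distance is required")
    return sum_xy / sum_xx


# Illustrative values only: pair-wise distances (m) and SNV counts.
example_slope = slope_through_origin([1.0, 2.0, 3.0], [2, 4, 6])
```

The slope has units of SNVs per meter, which is the quantity b used for the per-meter mutation rate in Supplementary file 1h.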

Figure 3 with 4 supplements
Mutational spectra of somatic SNVs.

Somatic mutation spectra in S. laevis (upper panel) and S. leprosula (lower panel). The horizontal axis shows 96 mutation types on a trinucleotide context, coloured by base substitution type. Different colours within each bar indicate complementary bases. For each species, the data from two trees (S1 and S2 for S. laevis and F1 and F2 for S. leprosula) were pooled to calculate the fraction of each mutated triplet.
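
The 96 mutation types arise from six base substitution classes (with the reference base written as a pyrimidine) multiplied by the 16 possible flanking-base combinations. A minimal sketch of that classification follows. It folds complementary mutations into one type, whereas the figure keeps them distinguishable by colour within each bar; the label format and function names are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mutation_type(trinucleotide, alt):
    """Label a substitution in its trinucleotide context, pyrimidine-centred.

    trinucleotide: reference bases at positions -1, 0, +1 (e.g. "TGA").
    alt: the mutated base at the central position.
    """
    ref = trinucleotide[1]
    if ref in "AG":  # purine reference: describe it on the complementary strand
        trinucleotide = reverse_complement(trinucleotide)
        alt = alt.translate(COMPLEMENT)
        ref = trinucleotide[1]
    return f"{trinucleotide[0]}[{ref}>{alt}]{trinucleotide[2]}"


# Enumerate every possible substitution and collect the distinct labels.
ALL_TYPES = sorted(
    {
        mutation_type(left + ref + right, alt)
        for left in "ACGT"
        for ref in "ACGT"
        for right in "ACGT"
        for alt in "ACGT"
        if alt != ref
    }
)
```

Counting the labels of observed SNVs and dividing by the total gives the fraction of each mutated triplet plotted on the vertical axis.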

Figure 3—figure supplement 1
Mutational spectra of somatic and inter-individual substitutions.

(a) Somatic mutation spectra for S1 and S2 individuals in S. laevis. (b) Somatic mutation spectra for F1 and F2 individuals in S. leprosula. (c) Inter-individual SNVs between S1 and S2 (upper panel) and between F1 and F2 (lower panel). The horizontal axis shows 96 mutation types on a trinucleotide context, coloured by base substitution type. Different colours in each bar indicate complementary bases.

Figure 3—figure supplement 2
Manual confirmation of candidate SNVs.

(a) SNVs that passed manual confirmation. (b) SNVs that were removed due to their fixed heterozygote pattern. (c) SNVs that were removed due to the difference between the observed pattern and the genotyping call. (d) SNVs that were removed due to the presence of another allele with multiple reads.

Figure 3—figure supplement 3
Proportion of potential false positive SNVs for S. laevis (S1, S2) and S. leprosula (F1, F2).

Potential false positive SNVs were identified as the subset of candidate SNVs that were not included in the final set for each threshold (T40, T30, and T20). The size of this subset was then divided by the total number of potential SNVs at each threshold to determine the proportion.

Figure 3—figure supplement 4
Proportion of potential false negative SNVs for S. laevis (S1, S2) and S. leprosula (F1, F2).

Potential false negative SNVs were identified as the subset of potential SNVs present in the final set but excluded from the candidate SNVs for each threshold (T40, T30, and T20). The size of this subset was then divided by the total number of potential SNVs at each threshold to calculate the proportion.
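
One way to read the two proportions defined in these supplements is as set differences relative to the per-threshold call set. The sketch below encodes that reading and nothing more: it assumes false positives are SNVs called at a threshold but absent from the final set, false negatives are final SNVs missing at that threshold, and both are divided by the number of SNVs called at that threshold. The captions leave room for other interpretations, and all names are hypothetical.

```python
def error_proportions(snvs_by_threshold, final_set):
    """Return {threshold: (false_positive_proportion, false_negative_proportion)}.

    A threshold with no calls maps to None because the denominator is zero.
    """
    result = {}
    for threshold, called in snvs_by_threshold.items():
        if not called:
            result[threshold] = None
            continue
        false_positive = len(called - final_set) / len(called)
        false_negative = len(final_set - called) / len(called)
        result[threshold] = (false_positive, false_negative)
    return result


# Toy identifiers standing in for SNVs.
example = error_proportions(
    {"T40": {1, 2, 3, 4}, "T30": {1, 2, 3, 5}, "T20": set()},
    final_set={1, 2, 3, 5},
)
```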

Figure 4 with 1 supplement
Detecting selection on somatic and inter-individual SNVs.

(a) An illustration of somatic and inter-individual SNVs. Different colours indicate different genotypes. (b) Expected (Exp.) and observed (Obs.) rates of somatic non-synonymous substitutions. (c) Expected (Exp.) and observed (Obs.) rates of inter-individual non-synonymous substitutions. (d) The difference between the fractions of the inter-individual and somatic substitution spectra in S. laevis (upper panel) and S. leprosula (lower panel). Positive and negative values are plotted in different colours. The horizontal axis shows 96 mutation types on a trinucleotide context, coloured by base substitution type.

Figure 4—figure supplement 1
A calculation scheme for the expected rate of non-synonymous mutation.

The possible numbers of synonymous (N_S), missense (N_M), and nonsense (N_Non) mutations were counted for each of six base substitution classes from all possible mutations in CDS of length L_cds and used to calculate the expected rate of non-synonymous mutation. For non-synonymous mutations, we pooled the numbers of missense and nonsense mutations. The background mutation rate for each substitution class i (r_i) was calculated from the observed somatic substitutions in intergenic regions.
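
The caption does not spell out how the counts and background rates are combined. A natural reading, assumed here and not confirmed by the text above, is a rate-weighted fraction: the expected non-synonymous rate is Σ r_i (N_M,i + N_Non,i) divided by Σ r_i (N_S,i + N_M,i + N_Non,i), summed over the six substitution classes i. The sketch below implements that assumed formula with placeholder numbers.

```python
def expected_nonsyn_rate(classes):
    """Rate-weighted expected fraction of non-synonymous mutations.

    classes: dict mapping a substitution class label to a tuple
        (r_i, n_syn, n_missense, n_nonsense), where r_i is the background
        mutation rate and the counts are numbers of possible mutations in CDS.
    """
    weighted_nonsyn = sum(
        r * (n_mis + n_non) for r, _n_syn, n_mis, n_non in classes.values()
    )
    weighted_total = sum(
        r * (n_syn + n_mis + n_non) for r, n_syn, n_mis, n_non in classes.values()
    )
    if weighted_total == 0:
        raise ValueError("no possible mutations with a non-zero rate")
    return weighted_nonsyn / weighted_total


# Placeholder inputs: equal rates and identical counts in every class.
SIX_CLASSES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")
toy_classes = {label: (1.0, 1, 2, 1) for label in SIX_CLASSES}
toy_expected = expected_nonsyn_rate(toy_classes)
```

With unequal r_i, classes that mutate more often contribute more to the expectation, which is the purpose of estimating the background rates from intergenic regions.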

Tables

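Author response table 1
The branch-level increment of SNVs per meter.

Species       Tree  Branch ID  0      1-1    1-2    2-1    2-2    3-1    3-2
S. laevis     S1    1-1        3.23
S. laevis     S1    1-2        5.18   13.32
S. laevis     S1    2-1        4.01   4.67   6.42
S. laevis     S1    2-2        3.78   4.42   6.19   7.18
S. laevis     S1    3-1        4.33   5.06   6.89   6.81   6.43
S. laevis     S1    3-2        4.30   5.02   6.85   6.75   6.38   10.82
S. laevis     S1    4          4.14   4.77   6.43   5.95   5.86   7.43   7.41
S. laevis     S2    1-1        2.37
S. laevis     S2    1-2        2.36   1.96
S. laevis     S2    2-1        3.11   2.86   2.84
S. laevis     S2    2-2        3.05   2.79   2.77   3.91
S. laevis     S2    3-1        2.61   2.23   1.86   3.45   3.36
S. laevis     S2    3-2        2.59   2.20   1.84   3.40   3.31   3.92
S. laevis     S2    4          2.77   2.43   2.41   3.73   3.63   3.74   3.67
S. leprosula  F1    1-1        1.21
S. leprosula  F1    1-2        1.11   1.18
S. leprosula  F1    2-1        1.50   1.35   1.19
S. leprosula  F1    2-2        1.57   1.46   1.28   2.22
S. leprosula  F1    3-1        1.27   1.05   0.96   1.41   1.52
S. leprosula  F1    3-2        1.19   0.95   0.87   1.27   1.38   2.41
S. leprosula  F1    4          1.07   0.83   0.78   1.05   1.14   0.97   0.82
S. leprosula  F2    1-1        0.83
S. leprosula  F2    1-2        0.55   2.07
S. leprosula  F2    2-1        0.34   1.10   0.54
S. leprosula  F2    2-2        0.39   1.19   0.62   1.25
S. leprosula  F2    3-1        0.44   1.24   0.68   0.46   0.53
S. leprosula  F2    3-2        0.47   1.32   0.74   0.50   0.58   1.15
S. leprosula  F2    4          0.60   1.57   0.95   0.71   0.78   0.97   1.05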

Additional files

Supplementary file 1

Supplementary tables.

(a) Mean annual increment (MAI) of diameter at breast height (DBH). (b) Matrix of physical distances (m) between sampling positions, with the number of SNVs indicated in parentheses. (c) Summary statistics of the studied trees. Height and DBH were measured directly for two individuals each of S. laevis and S. leprosula. Age was estimated as DBH divided by the mean annual increment (MAI). (d) Summary statistics of genome assemblies for S. laevis and S. leprosula. We assembled the genome using DNA extracted from the apical leaf at branch 1–1 of the tallest individual of each species (S1 and F1). (e) Summary statistics of whole genome sequencing. (f) The number of candidate SNVs during each step of the filtering process. (g) Assessment of candidate SNVs using amplicon sequencing. (h) Somatic mutation rates. The somatic mutation rate per nucleotide per meter (μ_g) was estimated as μ_g = b / (2 × R), where b is the slope of the linear regression. The somatic mutation rate per nucleotide per year (μ_y) was estimated as μ_y = M / (2 × R × A), where M is the total number of SNVs accumulated from the base to the branch tip and A is the tree age. R denotes the number of callable sites. (i) Cosine similarity of mutation spectra between Shorea trees and humans. (j) Results of the binomial test for selection on somatic and inter-individual SNVs. To test whether somatic and inter-individual SNVs are subject to selection, we calculated the expected rate of non-synonymous mutation. Given the observed numbers of non-synonymous and synonymous mutations, we tested the null hypothesis of neutrality using a binomial test at a significance level of 5%. pN_expected and pN_observed represent the expected and observed rates of non-synonymous substitutions. (k) The final set of SNVs. (l) Fractions of synonymous, missense, and nonsense substitutions. (m) Somatic mutation rates for six substitution classes, calculated from the observed number of SNVs in both the intergenic region and the whole genome. S1 + S2 and F1 + F2 represent the use of pooled data from two individuals for each species: S. laevis (S1, S2) and S. leprosula (F1, F2). The values based on the pooled data (indicated in bold type) were used to calculate the expected rate of non-synonymous mutation. (n) List of genes with somatic SNVs.
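
The quantities in items (c), (h) and (j) reduce to a few lines of arithmetic. The sketch below assumes the two rates are the fractions b / (2 × R) and M / (2 × R × A), with the factor of 2 taken to reflect a diploid genome, and it uses a generic exact two-sided binomial test, which may differ from the test variant the authors ran. Function names and example numbers are hypothetical.

```python
from math import comb


def tree_age(dbh, mai):
    """Age estimate: DBH divided by the mean annual increment (same length unit)."""
    return dbh / mai


def rate_per_meter(slope_b, callable_sites):
    """Somatic mutation rate per nucleotide per meter, b / (2 R)."""
    return slope_b / (2 * callable_sites)


def rate_per_year(total_snvs, callable_sites, age_years):
    """Somatic mutation rate per nucleotide per year, M / (2 R A)."""
    return total_snvs / (2 * callable_sites * age_years)


def binomial_two_sided_p(k, n, p):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    observing k successes in n trials with success probability p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))


# Illustrative numbers only, not values from the supplementary tables.
age = tree_age(dbh=50.0, mai=0.5)
mu_g = rate_per_meter(slope_b=2.0, callable_sites=1e8)
mu_y = rate_per_year(total_snvs=100, callable_sites=1e8, age_years=age)
p_value = binomial_two_sided_p(k=0, n=10, p=0.5)
```

In the selection test, k would be the observed number of non-synonymous SNVs, n the total number of coding SNVs, and p the expected rate pN_expected; the null hypothesis is rejected when the p-value falls below 0.05.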

https://cdn.elifesciences.org/articles/88456/elife-88456-supp1-v1.xlsx
MDAR checklist
https://cdn.elifesciences.org/articles/88456/elife-88456-mdarchecklist1-v1.docx

Cite this article

  1. Akiko Satake
  2. Ryosuke Imai
  3. Takeshi Fujino
  4. Sou Tomimoto
  5. Kayoko Ohta
  6. Mohammad Na'iem
  7. Sapto Indrioko
  8. Widiyatno Widiyatno
  9. Susilo Purnomo
  10. Almudena Molla Morales
  11. Viktoria Nizhynska
  12. Naoki Tani
  13. Yoshihisa Suyama
  14. Eriko Sasaki
  15. Masahiro Kasahara
(2024)
Somatic mutation rates scale with time not growth rate in long-lived tropical trees
eLife 12:RP88456.
https://doi.org/10.7554/eLife.88456.3