Measures of genetic diversification in somatic tissues at bulk and single-cell resolution

  1. Marius E Moeller
  2. Nathaniel V Mon Père
  3. Benjamin Werner (corresponding author)
  4. Weini Huang (corresponding author)
  1. Department of Mathematics, Queen Mary University of London, United Kingdom
  2. Evolutionary Dynamics Group, Centre for Cancer Genomics and Computational Biology, Barts Cancer Centre, Queen Mary University of London, United Kingdom
  3. Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles, Belgium
  4. Group of Theoretical Biology, The State Key Laboratory of Biocontrol, School of Life Science, Sun Yat-sen University, China
4 figures, 2 tables and 1 additional file

Figures

The distribution of variant allele frequencies changes with the growth phases and by sampling.

(a) In the population under consideration, cells divide either symmetrically into two daughter cells or asymmetrically, with only one daughter cell kept in the population. All other events are mathematically equivalent to cell death and are treated as such. (b) The rates of symmetric and asymmetric division change during population growth and lead to a dynamic distribution of variant allele frequencies (VAFs). (c) Sampling shifts the observed VAF distribution again relative to that of the whole population, a fact that should be considered when inferring population properties from genetic data.
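The sampling effect in panel (c) can be illustrated with a minimal sketch. It assumes cells are sampled uniformly without replacement, so that the number of sampled cells carrying a given mutation is hypergeometric; this is an illustrative assumption, not the paper's own sampling formula, and the function name and the values of N, S and k are hypothetical.

```python
from math import comb

def sampled_count_pmf(k, N, S):
    """P(j of S sampled cells carry a mutation present in k of N cells).

    Assumes uniform sampling without replacement (hypergeometric).
    """
    total = comb(N, S)
    return [comb(k, j) * comb(N - k, S - j) / total
            for j in range(min(k, S) + 1)]

# Illustrative values only: a population of N cells and a sample of S cells.
N, S = 10_000, 89
results = {}
for k in (1, 100, 5_000):
    pmf = sampled_count_pmf(k, N, S)
    mean_count = sum(j * p for j, p in enumerate(pmf))
    results[k] = (pmf[0], mean_count)
    print(f"k={k}: P(not sampled)={pmf[0]:.4f}, mean sampled count={mean_count:.3f}")
```

Under this assumption, mutations that are rare in the population are usually absent from the sample, while the lowest observable frequency in the sample (1/S) collects mutations from a range of true population frequencies.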

Figure 2 with 1 supplement
Bulk sequencing based variant allele frequency (VAF) and mutation rate inferences in healthy esophagus.

(a) Expected VAF distributions obtained by evolving Equation 1 to different time points for a population with an initial exponential growth phase followed by a constant population phase (mature size N=10^3). Once the population reaches its carrying capacity, the distribution moves from the 1/f^2 shape of a growing population (purple) to the 1/f shape of a constant population (green). Note that the shift slows considerably at older ages. (b) VAFs from healthy esophageal tissue of nine individuals, sorted into age brackets. The youngest bracket, 20–39, is closer to the developmental 1/f^2 scaling. The older age brackets are both close to the constant-population 1/f scaling, in line with the theoretical expectations. (c) Inferred mutation rates increase linearly with age. (d) Simulations of slowly growing stem cell populations reveal that mutation rates appear to increase with age, although the true underlying per-division mutation rate remains constant (see also Figure 2—figure supplement 1).
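As a rough illustration of how the two limiting shapes in panels (a) and (b) can be told apart, the sketch below fits a straight line to a spectrum on log-log axes and reads off the exponent. The two spectra are idealized power laws with arbitrary prefactors, not solutions of Equation 1, and the helper name `loglog_slope` is hypothetical.

```python
import math

def loglog_slope(freqs, counts):
    """Least-squares slope of log(counts) against log(freqs)."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Idealized spectra on frequencies 1/N, 2/N, ..., 100/N (prefactors arbitrary).
N = 1000
freqs = [k / N for k in range(1, 101)]
growing = [1.0 / f ** 2 for f in freqs]   # growing-population limit
constant = [1.0 / f for f in freqs]       # constant-population limit

slope_growing = loglog_slope(freqs, growing)
slope_constant = loglog_slope(freqs, constant)
print(f"growing: {slope_growing:.2f}, constant: {slope_constant:.2f}")
```

A measured spectrum whose fitted exponent lies between these two values would correspond to a population in transition between the regimes; real data are noisy and discretized, so such a fit is only a first diagnostic.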

Figure 2—figure supplement 1
Comparison of an exponential and logistic growth model.

(a) Population size over time for exponential (dotted) and logistic (dashed) growth functions. At the point of maturity (t=20), the exponential function reaches the population capacity (N_M=10000) and the logistic function equals 0.99 N_M. The time points at which the VAF spectra are measured are indicated by solid vertical lines: t_1=25 (blue) and t_2=50 (orange). (b) Comparison of the expected variant allele frequency (VAF) spectra, calculated with Equation 1, in the exponential (dotted) and logistic (dashed) growth models, measured at the time points t_1 and t_2. For reference, the theoretical predictions for an exponentially growing population without cell death and for a constant population in the long-time limit are shown as solid purple and green lines, respectively.
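The two growth curves in panel (a) can be sketched as follows. The sketch assumes that both populations start from a single cell at t=0, that the exponential curve is held at capacity after maturity, and that the logistic curve has the standard form; these are assumptions made for illustration, and the growth rates are derived only from the stated conditions at t=20.

```python
import math

N_M = 10_000   # population capacity (from the caption)
T_M = 20.0     # time of maturity (from the caption)

# Rates chosen so that the stated conditions at t = T_M hold,
# assuming a single founder cell at t = 0 (assumption).
R_EXP = math.log(N_M) / T_M                # exponential reaches N_M at T_M
R_LOG = math.log(99 * (N_M - 1)) / T_M     # logistic reaches 0.99 * N_M at T_M

def exponential_size(t):
    """Exponential growth, capped at the capacity after maturity."""
    return min(math.exp(R_EXP * t), N_M)

def logistic_size(t):
    """Standard logistic growth from one cell toward capacity N_M."""
    return N_M / (1 + (N_M - 1) * math.exp(-R_LOG * t))

for t in (0, 10, 20, 25, 50):
    print(t, round(exponential_size(t), 1), round(logistic_size(t), 1))
```

With these choices the logistic curve grows faster early on and approaches the capacity smoothly, whereas the exponential curve switches abruptly to constant size at maturity.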

Figure 3 with 3 supplements
Inference of evolutionary parameters on simulated stem cell populations.

Simulated populations were run up to age 59, growing exponentially from a single cell to constant size N_M=10000 at age t_M=5, with mutation rate μ=1.2, division rate λ=5 and asymmetric division probability p=0.4. Where sampling is mentioned, a sample of 89 cells was taken. (a) The single-cell mutational burden distribution. The compound Poisson distribution (dashed line) matches the burden distribution when averaging over multiple independently evolved populations (filled curve). (b) Distribution of estimated mutation rates from 10,000 individual simulations, obtained from burden distributions of the complete populations (blue) as well as sampled sets of cells (orange). Because the expected mutational burden distribution is unaltered by sampling, the expected estimate of the mutation rate from Equation 5 remains unchanged: E(μ̃_pop) = E(μ̃_sample). However, sampling increases the noise on the observed burden distribution, which results in a larger error margin of the estimate: σ(μ̃_pop) < σ(μ̃_sample). (c) VAF spectra measured in the complete population (blue) and a sampled set of cells (orange). In contrast to the mutational burden distribution, strong sampling changes the shape of the expected distribution. A single simulation result is shown (diamonds) alongside the theoretically predicted expected values for both the total and sampled populations (Equations 1 and 12) (dashed line) and the average across 100 simulations (solid line). (d) Distribution of N_M and p inference results for 100 simulated and sampled populations, obtained by estimating μ̃ and λ̃ from the single-cell burden distribution and fitting the number of lowest-frequency (1/S) mutations to the theoretical prediction in Equation 1 (see also Figure 3—figure supplements 1–3).
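The statement in panel (b), that sampling leaves the expected estimate unchanged but widens its spread, can be illustrated with a toy calculation. The sketch assumes that each cell's burden is compound Poisson, with a Poisson number of divisions each adding a Poisson(μ) number of mutations, for which Variance/Mean = 1 + μ. The estimator below is that identity, not necessarily Equation 5, and cells are drawn independently, which ignores the shared ancestry present in the simulated populations. All names and the reduced parameter values are hypothetical.

```python
import math
import random
from statistics import mean, stdev, variance

def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_burdens(n_cells, mean_divisions, mu, rng):
    """Independent compound-Poisson burdens (assumption: no shared ancestry)."""
    burdens = []
    for _ in range(n_cells):
        divisions = poisson(mean_divisions, rng)
        # A sum of `divisions` Poisson(mu) draws is Poisson(divisions * mu).
        burdens.append(poisson(divisions * mu, rng))
    return burdens

def estimate_mu(burdens):
    """Moment estimator from Variance/Mean = 1 + mu."""
    return variance(burdens) / mean(burdens) - 1.0

rng = random.Random(1)
MU, MEAN_DIVISIONS = 1.2, 20            # reduced scale for speed (illustrative)
N_CELLS, SAMPLE_SIZE, RUNS = 500, 89, 60

pop_estimates, sample_estimates = [], []
for _ in range(RUNS):
    population = simulate_burdens(N_CELLS, MEAN_DIVISIONS, MU, rng)
    pop_estimates.append(estimate_mu(population))
    sample_estimates.append(estimate_mu(rng.sample(population, SAMPLE_SIZE)))

print("population:", round(mean(pop_estimates), 3), round(stdev(pop_estimates), 3))
print("sample:    ", round(mean(sample_estimates), 3), round(stdev(sample_estimates), 3))
```

In this toy setting both sets of estimates center near the true μ, while the estimates from 89 sampled cells scatter more widely than those from the full set of cells.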

Figure 3—figure supplement 1
Stochastic simulations of the Variance/Mean of the mutational burden distribution over time for a per cell division mutation rate of μ=1.3 and varying stem cell population size N and asymmetric division probability p.

Stochastic fluctuations are pronounced for small population size N and low asymmetric division probability p.

Figure 3—figure supplement 2
Likelihood of the Variance/Mean to be in the interval 3<θ<5 for a per division mutation rate of μ=1.3.

Figure 3—figure supplement 3
Likelihood of the Variance/Mean to be in the interval 3<θ<5 for a per division mutation rate of μ=3.

Figure 4 with 1 supplement
Evolutionary inferences in single-cell hematopoietic stem cell (HSC) data.

(a) The single-cell mutational burden distribution of the data (bars) and the compound Poisson distribution obtained from its mean and variance, used to obtain the estimated per-division mutation rate μ̃. (b) Distribution of mutation frequencies in the data and the theoretically predicted average, fitted to only the lowest-frequency (1/S) data point. (c) Difference Δvf between the measured value of the VAF spectrum at the lowest frequency (1/S) and its prediction from Equation 1, for varying total population size N and asymmetric division proportion p, with fixed maturation time t_M=5 and operational hematopoietic population size N_H=50. The solid line denotes the plane of best fit where this difference is 0. (d) Maximum inferred population size N (taking p=0 in (c)) for varying maturation time t_M and operational hematopoietic population size N_H (see also Figure 4—figure supplement 1).

Figure 4—figure supplement 1
The standard deviation on the variant allele frequency (VAF) spectrum increases for higher frequencies.

(a) The VAF spectrum V_f averaged across 100 simulations of a population evolved according to the model described in section 1.1. The standard deviation from the mean is shown as the band V_f ± σ_f around the average. (b) The standard deviation across all simulations for each frequency f = 1/N, 2/N, …, scaled by the average spectrum V_f.
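The quantities in panels (a) and (b) amount to a per-frequency mean, standard deviation and their ratio across simulation runs. A minimal sketch of that calculation is given below. The three short spectra are made-up toy numbers for illustration only, not simulation output, and the function name is hypothetical.

```python
from statistics import mean, pstdev

def relative_spread(spectra):
    """Per-frequency average, standard deviation and their ratio.

    `spectra` is a list of runs; each run lists the number of mutations
    at frequencies 1/N, 2/N, ... in the same order.
    """
    per_frequency = list(zip(*spectra))
    averages = [mean(column) for column in per_frequency]
    spreads = [pstdev(column) for column in per_frequency]
    ratios = [s / a if a > 0 else float("nan")
              for s, a in zip(spreads, averages)]
    return averages, spreads, ratios

# Toy input (hypothetical values): three runs, three frequency bins.
toy_spectra = [
    [100, 20, 4],
    [110, 25, 2],
    [90, 15, 6],
]
averages, spreads, ratios = relative_spread(toy_spectra)
print(averages, [round(r, 3) for r in ratios])
```

The population standard deviation (`pstdev`) is used here; whether the figure uses the population or the sample standard deviation is not stated, so that choice is an assumption.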

Tables

Table 1
Evolutionary parameters appearing in the model system.
| Symbol | Description | Units |
|---|---|---|
| N_M | Carrying capacity of the mature population | |
| t_M | Age when the cell population reaches mature size | years |
| N_H | Population size at homeostatic divisions (start of the mixed-growth phase) | |
| γ | Symmetric division rate in early developmental phase | /year |
| ρ | Symmetric division rate in homeostatic state | /year |
| Φ | Asymmetric division rate in homeostatic state | /year |
| μ | Mutation rate | /division/daughter |

Table 2
Evolutionary parameters appearing in the analytical derivations of the expected VAF distribution in the Moran and pure-birth models.
| Symbol | Description |
|---|---|
| N | Total number of cells |
| k | Abundance or number of cells, k < N |
| V_k(t) | Number of mutations with abundance k |
| P_k | Likelihood for a mutation with abundance k to increase or decrease |
| C_k(t) | The average number of mutations with abundance k that increase or decrease |
| dV_k(t)/dt | Average change of V_k(t) per time step |
| μ | Mutation rate per daughter cell |

Cite this article

  1. Marius E Moeller
  2. Nathaniel V Mon Père
  3. Benjamin Werner
  4. Weini Huang
(2024)
Measures of genetic diversification in somatic tissues at bulk and single-cell resolution
eLife 12:RP89780.
https://doi.org/10.7554/eLife.89780.3