Joint inference of evolutionary transitions to self-fertilization and demographic history using whole-genome sequences

  1. Stefan Strütt
  2. Thibaut Sellinger
  3. Sylvain Glémin
  4. Aurélien Tellier (corresponding author)
  5. Stefan Laurent (corresponding author)

  1. Max Planck Institute for Plant Breeding Research, Germany
  2. Department of Life Science Systems, Technical University of Munich, Germany
  3. Department of Environment and Biodiversity, Paris Lodron University of Salzburg, Austria
  4. Université Rennes 1, CNRS, ECOBIO, France
  5. Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Sweden
5 figures, 1 table and 2 additional files

Figures

Figure 1 with 3 supplements
Consequences of a transition to selfing on the genealogies of simulated chromosomes.

(A) Joint and marginal distributions of ages (TMRCA in generations on a log10 scale) and lengths of TMRCA-segments (in bp on a log10 scale) in a selfing population (σ=0.95) with a stepwise change …

Figure 1—figure supplement 1
Probability of recombination events over time under different reproductive histories.

Probability that a break in the genealogy occurs in a coalescent tree of sample size two in the absence of selfing (red); in the presence of 90% selfing (orange); in the case of a transition from …

Figure 1—figure supplement 2
Comparison of the joint distributions of TMRCA and lengths of TMRCA-segments (TL-distribution) under three different simulation approaches.

(A) Explicit selfing implemented in a forward-in-time Wright-Fisher model (slim3). Population size is constant and selfing rate shifts from outcrossing (σ=0, green) to predominant selfing (σ=0.95, …

Figure 1—figure supplement 3
Consequences of a transition to selfing on the genealogies of simulated chromosomes for different ages of the transition.

(A–I) Joint and marginal distributions of ages in generations and lengths of TMRCA-segments in a population with a constant population size and a shift from outcrossing (green) to predominant …

Figure 2 with 8 supplements
Performance of teSMC on simulated polymorphism data.

Inference of times of transition from outcrossing (σ=0.1) to predominant selfing (σ=0.99) using neutral simulations. The x-axes represent the true value of tσ in units of log10(generations) and the y

Figure 2—figure supplement 1
Theoretical convergence of teSMC under complex demography.

Best-case convergence of teSMC using 10 sequences (i.e. haploid genomes) of 100 Mb (green) when the population undergoes a bottleneck (true sizes are indicated in gray) with either variation of selfing …

Figure 2—figure supplement 2
Best-case convergence of teSMC for different amounts of data.

Best-case convergence of teSMC using different combinations of sample sizes (n=2, n=5, or n=20 sequences; i.e. haploid genomes) and sequence lengths (L=10 Mb or L=100 Mb, see legend). Population size is …

Figure 2—figure supplement 3
Mis-inference of population sizes when transitions to selfing are not accounted for.

Comparison of true values (black lines) with selfing rates and population sizes estimated by teSMC for 10 replicates. Here, simulations were done using a constant population size (N=40,000) …

Figure 2—figure supplement 4
Inference of population sizes and selfing rates estimated by teSMC when both parameters change over time.

(A–P) Comparison of true values (black lines) with selfing rates and population sizes estimated by teSMC for 10 replicates. Here, simulations were done as in Figure 2—figure supplement 3

Figure 2—figure supplement 5
Estimated population sizes by teSMC with variable mutation and recombination rates along the genome.

Estimated population sizes by teSMC using 20 sequences (i.e. haploid genomes) of length 5 Mb (red) when population size is constant and set to 100,000 (black line) with a constant selfing value of …

Figure 2—figure supplement 6
Estimated selfing rates through time by teSMC with variable mutation and recombination rates along the genome.

Selfing rates through time estimated by teSMC using 20 sequences (i.e. haploid genomes) of length 5 Mb (red) when population size is constant and set to 100,000 (black line) with a …

Figure 2—figure supplement 7
Best-case convergence of teSMC under complex selfing transitions.

Best-case convergence of teSMC under four different scenarios of selfing transition. (A) Slow transition from outcrossing to selfing. (B) Transition from selfing to outcrossing. (C) Transition from …

Figure 2—figure supplement 8
Performance of teSMC under complex selfing transitions.

Performance of teSMC on simulated sequence data under four different scenarios of selfing variations. (A) Slow transition from outcrossing to selfing. (B) Transition from selfing to outcrossing. (C) …

Figure 3 with 2 supplements
Approximate Bayesian computation (ABC) model choice performance analysis.

(A) Demographic model 1 in the model choice analysis: one population with a single transition from predominant selfing to predominant outcrossing. (B) Demographic model 2 in the model choice …

Figure 3—figure supplement 1
Pairwise diversity transition matrices used in tsABC.

(A) The transition matrix of pairwise diversity (TMwin) between adjacent non-overlapping 10 kb windows measured for 1 Mb of data simulated with the population-size change model (Figure 3B) in an …
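The idea of a transition matrix between adjacent windows can be illustrated with a minimal Python sketch. It assumes that per-window pairwise diversity values are already available and that they are discretised with user-chosen bin edges; the function name `transition_matrix`, the bin edges, and the normalisation to frequencies are illustrative assumptions, not the binning scheme used in tsABC.

```python
from bisect import bisect_right


def transition_matrix(window_pi, bin_edges):
    """Frequencies of moving from diversity bin i to bin j between adjacent windows.

    window_pi : pairwise diversity per non-overlapping window, in genomic order.
    bin_edges : sorted thresholds defining len(bin_edges) + 1 diversity bins
                (hypothetical choice, for illustration only).
    """
    n_bins = len(bin_edges) + 1
    # Bin index of each window = number of edges that are <= its diversity value.
    bins = [bisect_right(bin_edges, value) for value in window_pi]

    # Count transitions between each window and its right-hand neighbour.
    counts = [[0] * n_bins for _ in range(n_bins)]
    for left, right in zip(bins, bins[1:]):
        counts[left][right] += 1

    # Normalise by the number of adjacent window pairs.
    n_pairs = len(bins) - 1
    if n_pairs <= 0:
        return [[0.0] * n_bins for _ in range(n_bins)]
    return [[c / n_pairs for c in row] for row in counts]


# Five windows, two bins (low vs. high diversity) split at 0.005.
tm = transition_matrix([0.001, 0.002, 0.010, 0.012, 0.001], [0.005])
```

Mass on the diagonal of such a matrix indicates that neighbouring windows tend to share similar diversity, which is the kind of spatial signal a window-based summary statistic is meant to capture.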

Figure 3—figure supplement 2
Approximate Bayesian computation (ABC) performance analysis.

Parameter re-estimation of the three additional parameters of the model described in Figure 3. (A, B, C) Re-estimation of the population size on 100 datasets simulated under a model with constant …

Figure 4 with 1 supplement
Accuracy of teSMC and tsABC in the presence of background selection (BGS).

Inference of times of transition from outcrossing (σ=0.1) to predominant selfing (σ=0.99) using (A) teSMC and (B) tsABC. Simulations were done under constant population size and negative selection …

Figure 4—figure supplement 1
Accuracy of tsABC in the presence of background selection (BGS).

Parameter re-estimation of the three remaining parameters of the model (see Figure 3A). (A, B, C) Re-estimation of the population size on 100 datasets simulated under a model with constant …

Figure 5 with 2 supplements
Inference of the time of transition from outcrossing to selfing in A. thaliana.

(A) Inferred transitions from outcrossing to selfing for three independent genetic clusters of A. thaliana from the 1001 genomes project (CEU, IBnr, Relict) using tsABC and teSMC. The 95% CI (CEU) …

Figure 5—figure supplement 1
Genomic regions of A. thaliana genome (TAIR10) used for the teSMC and tsABC analyses.

Panels show, for the five chromosomes of A. thaliana, the nucleotide diversity (pi) calculated with tskit using a sliding window of 100 kb. Results are shown only for the CEU sample used (n=99) for …
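As a dependency-free illustration of the quantity plotted here, the following Python sketch computes nucleotide diversity (average pairwise differences per base pair) in windows along a sequence. It does not use tskit and is not the authors' pipeline; the function names, the input layout (one list of alleles per variable site), and the handling of the final truncated window are assumptions made for this example.

```python
from collections import Counter


def site_diversity(alleles):
    """Fraction of haplotype pairs that differ at one site (needs >= 2 haplotypes)."""
    n = len(alleles)
    pairs = n * (n - 1) // 2
    # Pairs carrying the same allele, summed over allele classes.
    same = sum(c * (c - 1) // 2 for c in Counter(alleles).values())
    return (pairs - same) / pairs


def windowed_pi(positions, genotypes, seq_len, window, step=None):
    """Nucleotide diversity per window.

    positions : site coordinates in bp.
    genotypes : genotypes[i] lists the allele of every haplotype at positions[i].
    step      : window shift in bp; defaults to `window` (non-overlapping windows).
    Returns a list of (start, end, pi) tuples.
    """
    step = step or window
    result = []
    start = 0
    while start < seq_len:
        end = min(start + window, seq_len)  # last window may be shorter
        total = sum(
            site_diversity(g)
            for pos, g in zip(positions, genotypes)
            if start <= pos < end
        )
        result.append((start, end, total / (end - start)))
        start += step
    return result


# Two variable sites in four haplotypes on a 200 bp sequence, 100 bp windows.
pi_windows = windowed_pi([10, 150], [[0, 1, 1, 0], [0, 0, 0, 1]], 200, 100)
```

Setting `step` smaller than `window` gives overlapping (sliding) windows, at the cost of correlated neighbouring estimates.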

Figure 5—figure supplement 2
Inference of the time of transition from outcrossing to selfing in A. thaliana using two different sets of genomic regions.

(A, B) Inferred transitions from outcrossing to selfing for three ancestry groups of A. thaliana from the 1001 genomes project (CEU: central European, IBnr: Iberia non-relict, Relict) using teSMC. (C…

Tables

Table 1
Estimated times of transitions from predominant outcrossing to predominant selfing in A. thaliana.

Estimations were conducted for three different ancestry groups: central Europe (CEU), Iberian non-relicts (IBnr), and Relicts using both teSMC and tsABC. The 95% CI of all jointly inferred …

Method  Population  Sample size  Mode     95% CI lower  95% CI upper
teSMC   CEU         20           697,490  NA            NA
teSMC   IBnr        20           713,421  NA            NA
teSMC   Relicts     17           749,668  NA            NA
tsABC   CEU         99           707,995  443,486       973,841
tsABC   IBnr        66           756,976  397,049       988,708
tsABC   Relicts     17           592,321  386,406       934,499

Additional files

Supplementary file 1

Supplementary tables.

Table S1: Parameters for simulated datasets to investigate the performance of tsABC.
Table S2: Parameter priors used for the performance analysis of tsABC.
Table S3: Parameter priors used to estimate the transition from outcrossing to selfing in A. thaliana.
Table S4: A. thaliana samples with >95% cluster membership (https://1001genomes.github.io/admixture-map/) used to estimate the transition from outcrossing to selfing, obtained from the 1001 genomes website (https://1001genomes.org/data/GMI-MPI/releases/v3.1/SNP_matrix_imputed_hdf5/1001_SNP_MATRIX.tar.gz).
Table S5: Genomic regions of A. thaliana in TAIR10 used for the tsABC and teSMC analyses.
Table S6: Jointly estimated parameters in the context of transitions from predominant outcrossing to predominant selfing in A. thaliana.

https://cdn.elifesciences.org/articles/82384/elife-82384-supp1-v2.xlsx
MDAR checklist
https://cdn.elifesciences.org/articles/82384/elife-82384-mdarchecklist1-v2.docx
