Down the Penrose stairs, or how selection for fewer recombination hotspots maintains their existence

  1. Zachary Baker (corresponding author)
  2. Molly Przeworski
  3. Guy Sella
  1. Department of Systems Biology, Columbia University, United States
  2. Department of Biological Sciences, Columbia University, United States
  3. Program for Mathematical Genomics, Columbia University, United States
10 figures, 1 table and 13 additional files

Figures

Figure 1
Overview of the model.
Figure 2
Fitness in the model with one heat.

(A) Cartoon depicting the qualitative change in fitness as a function of time under different models, assuming that all individuals are homozygous for a single PRDM9 allele (without turnover) and that all binding sites have the same binding affinity. For visualization purposes, we present results for a linear function. (B) The probability of an individual binding site being bound (thicker lines) or of a given locus being bound symmetrically (thinner lines) if all sites are very weak and non-competitive (k=10^6; blue) or very strong and competitive (k=1; red). The values shown were computed using Equation 1 and Equation S2.2.2 where, for the sake of comparison, in both cases we set the number of PRDM9 molecules such that 5000 sites would be bound in the presence of 20,000 binding sites (roughly consistent with observations in mice; Parvanov et al., 2017; Grey et al., 2017). When binding sites are very weak, most PRDM9 molecules are not bound and therefore our model behaves as if there were no competition between binding sites.
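
The comparison in panel B can be made concrete with a short calculation. The sketch below assumes simple equilibrium binding, in which a site with dissociation constant k is bound with probability PF / (PF + k), where PF is the number of free PRDM9 molecules. This form, the function name, and the simplification of one molecule per bound site are assumptions made for illustration; they are not quoted from Equation 1.

```python
def prdm9_required(k, n_sites=20_000, n_bound=5_000):
    """Return (free, total) PRDM9 molecules needed to bind n_bound of n_sites.

    Assumes every site has the same dissociation constant k and is bound with
    probability free / (free + k) (an illustrative equilibrium-binding form).
    """
    h = n_bound / n_sites        # target per-site binding probability
    free = k * h / (1.0 - h)     # invert h = free / (free + k)
    total = free + n_bound       # assumption: each bound site holds one molecule
    return free, total


for label, k in [("strong, k=1", 1.0), ("weak, k=10^6", 1e6)]:
    free, total = prdm9_required(k)
    print(f"{label}: free={free:.3g}, total={total:.3g}, "
          f"fraction bound={5_000 / total:.3g}")
```

Under these assumptions nearly every molecule is bound when k=1, whereas fewer than 2% are bound when k=10^6, which is the sense in which very weak sites do not compete with one another.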

Figure 3 with 1 supplement
Fitness in the model with two heats.

Fitness as a function of the number of hotspots, for an individual homozygous for a PRDM9 allele (blue), or heterozygous for two PRDM9 alleles with the same number of hotspots (red), when considering (A) weak hotspots or (B) strong ones. Fitness was calculated using Equations 5 and 6, assuming a backdrop of weak binding sites that would bind 99% of PRDM9 molecules in the absence of hotspots. Vertical dashed lines indicate the number of hotspots that maximizes the fitness of individuals homozygous (blue) or heterozygous (red) for the allele. The area shaded in light red indicates the numbers of hotspots at which an individual homozygous for a PRDM9 allele has lower fitness than the maximal fitness of a heterozygous individual that carries one copy of the same PRDM9 allele (where the other allele has the optimal number of hotspots for heterozygotes). In both panels, the region shaded in light red on the left reflects the case wherein the homozygous individual has reduced fitness as a consequence of PRDM9 binding to weak binding sites. In panel B, the region on the right reflects the case wherein the homozygous individual has reduced fitness as a consequence of limited symmetric binding at hotspots. The numbered arrows illustrate the behavior of the simplified evolutionary dynamic described in the text.

Figure 3—figure supplement 1
The probability that a hotspot has been symmetrically bound (red), and the probability (or proportion) of DSBs localizing to hotspots (blue), when considering hotspots with (A) a relatively strong dissociation constant, or (B) a weaker one, under the two-heat model in individuals homozygous (solid lines) or heterozygous (dashed lines) for PRDM9.
Figure 4
The dynamics of the model with two heats.

The mean number of hotspots (n1), diversity at PRDM9 (π), and mean fitness (W), as a function of time. We show four cases: with weak and strong hotspots (top and bottom rows, respectively) and large and small population size (left and right column, respectively). Simulations (as detailed in the model section) were run with the specified population sizes and hotspot dissociation constants, with other parameter values detailed in Table 1. The time (x-axis) in each plot has been scaled by the duration of PRDM9 turnover to allow for ~10 ‘cycles’ (see Figure 3), which span ~10^3 times more generations in small populations compared to large ones. The highlighted range of hotspot numbers in which a PRDM9 allele is ‘susceptible to invasion’ corresponds to the regions shaded in red in Figure 3. In turn, the highlighted range of hotspot numbers designated as ‘optimal hotspots’ corresponds to the range between the number that optimizes fitness in heterozygotes and the number that optimizes fitness in homozygotes (between the dashed lines shown in Figure 3). See Appendix 4 for analogous results assuming an alternative value for the number of PRDM9 molecules expressed during meiosis (PT=1000), and Appendix 5 for analogous results assuming a tight unimodal distribution of the number of hotspots associated with newly arising PRDM9 alleles. This figure was generated using Figure 4—source data 1–4.
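
As a reading aid for the diversity curves, the sketch below computes a diversity statistic from a set of allele frequencies. It assumes that π denotes expected heterozygosity (one minus the sum of squared allele frequencies); that definition is a common convention and is an assumption here, since the caption does not state it. The function name is hypothetical.

```python
def diversity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies.

    Frequencies are renormalized so that they sum to one before use.
    """
    total = sum(freqs)
    if total <= 0:
        raise ValueError("allele frequencies must sum to a positive value")
    normalized = [f / total for f in freqs]
    return 1.0 - sum(f * f for f in normalized)


print(diversity([1.0]))        # a single fixed PRDM9 allele
print(diversity([0.5, 0.5]))   # two equally common alleles
```

With this definition, a population fixed for one allele has diversity 0, and two equally common alleles give 0.5.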

Figure 4—source data 1

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 55,000 simulated generations of the two-heat model with a large population size (N = 10^6) and weak hotspots (k1 = 50).

This data was used to generate Figure 4 and Figure 6—figure supplement 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig4-data1-v2.zip
Figure 4—source data 2

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 2,000,000 simulated generations of the two-heat model with a small population size (N = 10^3) and weak hotspots (k1 = 50).

This data was used to generate Figure 4 and Figure 6—figure supplement 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig4-data2-v2.zip
Figure 4—source data 3

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 55,000 simulated generations of the two-heat model with a large population size (N = 10^6) and strong hotspots (k1 = 5).

This data was used to generate Figure 4 and Figure 6—figure supplement 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig4-data3-v2.zip
Figure 4—source data 4

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 500,000 simulated generations of the two-heat model with a small population size (N = 10^3) and strong hotspots (k1 = 5).

This data was used to generate Figure 4 and Figure 6—figure supplement 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig4-data4-v2.zip
Figure 5
The distribution of initial and final numbers of hotspots among segregating PRDM9 alleles.

The relative densities of initial and final numbers of hotspots (corresponding to invading and exiting alleles, respectively) for different dissociation constants (rows) and population sizes (line colors) obtained from simulations of five million PRDM9 allele trajectories with each set of parameters (see the model section for details about the simulations and Table 1 for other parameter values). Shaded regions are defined the same way as in Figure 4. This figure was generated using Figure 5—source data 1–2.

Figure 5—source data 1

The distribution of the initial number of hotspots (n1) among PRDM9 alleles, weighted by their mean allele frequency and sojourn time.

The first two columns represent population size (N) and the dissociation constant at hotspots (k1), respectively. The remaining 50 columns describe the distribution in bins, with each bin corresponding to a range of 100 initial hotspots (e.g., the first of these columns gives the probability that an allele drawn at random from the population initially began with between 1 and 100 hotspots). This data was used to generate Figure 5.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig5-data1-v2.zip
Figure 5—source data 2

The distribution of the number of hotspots (n1) among exiting PRDM9 alleles, weighted by their mean allele frequency and sojourn time.

The first two columns represent population size (N) and the dissociation constant at hotspots (k1), respectively. The remaining 100 columns describe the distribution in bins, with each bin corresponding to a range of 50 hotspots (e.g., the first of these columns gives the probability that an allele drawn at random from the population would ultimately leave the population with between 1 and 50 hotspots). This data was used to generate Figure 5.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig5-data2-v2.zip
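
The column layout described for the two Figure 5 source data files (50 bins of 100 hotspots, and 100 bins of 50 hotspots, each preceded by N and k1) can be sketched as follows. The helper names are hypothetical, and the code assumes each row has already been parsed into a list of numbers; the file format inside the archives is not specified here.

```python
def bin_range(bin_index, bin_width):
    """Return the inclusive (low, high) hotspot counts covered by a bin.

    bin_index is 1-based and counts only the distribution columns, i.e. it
    skips the two leading columns holding N and k1.
    """
    if bin_index < 1:
        raise ValueError("bin_index is 1-based")
    low = (bin_index - 1) * bin_width + 1
    return low, bin_index * bin_width


def split_row(row, n_bins):
    """Split one parsed row into (N, k1, list of bin probabilities)."""
    if len(row) != 2 + n_bins:
        raise ValueError(f"expected {2 + n_bins} columns, got {len(row)}")
    return row[0], row[1], list(row[2:])


print(bin_range(1, 100), bin_range(50, 100))   # layout with 50 bins of 100
print(bin_range(1, 50), bin_range(100, 50))    # layout with 100 bins of 50
```

Both layouts span the same range, from 1 to 5,000 hotspots, at different resolutions.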
Figure 6 with 2 supplements
The dependence of key quantities on population size and hotspot heat.

(A) Mean number of hotspots among incoming and exiting alleles (corresponding to the distributions shown in Figure 5). (B) Average turnover time of segregating PRDM9 alleles (see text). (C) Mean and interquartile range of diversity at PRDM9 over time. (D) Mean and interquartile range of fitness in the population. Each point in these graphs is based on simulations of 5×10^6 PRDM9 alleles (see model section for details) with the specified parameter values and the other model parameter values specified in Table 1. See Appendix 4 for analogous results, assuming alternative levels of PRDM9 expression. This figure was generated using Figure 6—source data 1 and 2.

Figure 6—source data 1

Summary statistics from simulations for a range of population sizes (N) and hotspot dissociation constants (k1), including mean turnover times, mean initial numbers of hotspots, and mean number of hotspots for exiting PRDM9 alleles (each weighted by the sojourn time of PRDM9 alleles).

This data was used to generate Figure 6.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig6-data1-v2.zip
Figure 6—source data 2

Additional summary statistics from simulations for a range of population sizes (N) and hotspot dissociation constants (k1), including mean diversity at PRDM9 (π) and population fitness (W), and the inter-quartile ranges of each.

This data was used to generate Figure 6.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig6-data2-v2.zip
Figure 6—figure supplement 1
The effect of population size on the rate of turnover (A) and diversity (B) at PRDM9 while keeping population-scaled mutation rates at PRDM9, binding sites, or both, constant (when k1=20).

This figure was generated using Figure 6—figure supplement 1—source data 1–2.

Figure 6—figure supplement 1—source data 1

Mean PRDM9 turnover times for a range of population sizes (N = 10^3 to 10^6) when considering moderately strong hotspots (k1=20), under normal parameters, when the population-scaled mutation rate at PRDM9 has been normalized to that of a moderate population size (N=10^4.5), when the population-scaled mutation rate at PRDM9 binding sites has been normalized to that of a moderate population size (N=10^4.5), and when both mutation rates have been so normalized.

This data was used to generate Figure 6—figure supplement 1.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig6-figsupp1-data1-v2.zip
Figure 6—figure supplement 1—source data 2

Mean values of diversity at the PRDM9 locus (π) for a range of population sizes (N = 10^3 to 10^6) when considering moderately strong hotspots (k1=20), under normal parameters, when the population-scaled mutation rate at PRDM9 has been normalized to that of a moderate population size (N=10^4.5), when the population-scaled mutation rate at PRDM9 binding sites has been normalized to that of a moderate population size (N=10^4.5), and when both mutation rates have been so normalized.

This data was used to generate Figure 6—figure supplement 1.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig6-figsupp1-data2-v2.zip
Figure 6—figure supplement 2
The effect of population-scaled mutation rates on the dynamics of the model when hotspots are cold (k1=50) or hot (k1=5).

Values for population size (N) and mutation rates at PRDM9 binding sites (Nμ) and at the PRDM9 locus (Nν) used in each simulation are shown to the right of each row. Values shown in black indicate those typical for small populations (N=10^3) and values shown in red indicate those typical for large populations (N=10^6). This figure was generated using Figure 4—source data 1–4 and Figure 6—figure supplement 2—source data 1–6.
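
The adjusted mutation rates used in these simulations follow from holding the population-scaled rate (N times the per-generation rate) fixed while changing N. The sketch below works through that arithmetic using the per-generation rates listed in Table 1; the function name is hypothetical.

```python
def matched_rate(rate_ref, n_ref, n_target):
    """Per-generation rate in a population of size n_target that gives the
    same population-scaled rate (N * rate) as rate_ref does in size n_ref."""
    return rate_ref * n_ref / n_target


N_LARGE, N_SMALL = 10**6, 10**3
NU = 1e-5       # mutation rate at the PRDM9 locus per generation (Table 1)
MU = 1.25e-7    # mutation rate at PRDM9 binding sites per generation (Table 1)

# Rates a small population needs in order to match the large population's
# scaled rates N*nu = 10 and N*mu = 0.125.
print(matched_rate(NU, N_LARGE, N_SMALL))
print(matched_rate(MU, N_LARGE, N_SMALL))
```

Up to floating-point rounding, these come to 10^-2 for ν and 1.25×10^-4 for μ, the values quoted in the source data descriptions for this supplement.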

Figure 6—figure supplement 2—source data 1

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 2,000,000 simulated generations of the two-heat model with a small population size (N = 10^3) and weak hotspots (k1=50), when the population-scaled mutation rate at PRDM9 was adjusted to match that of a large population (with N = 10^6; ν = 10^-2).

This data was used to generate Figure 6—figure supplement 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig6-figsupp2-data1-v2.zip
Figure 6—figure supplement 2—source data 2

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 500,000 simulated generations of the two-heat model with a small population size (N = 10^3) and strong hotspots (k1=5), when the population-scaled mutation rate at PRDM9 was adjusted to match that of a large population (with N = 10^6; ν = 10^-2).

This data was used to generate Figure 6—figure supplement 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig6-figsupp2-data2-v2.zip
Figure 6—figure supplement 2—source data 3

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 2,000,000 simulated generations of the two-heat model with a small population size (N = 10^3) and weak hotspots (k1=50), when the population-scaled mutation rate at PRDM9 binding sites was adjusted to match that of a large population (with N = 10^6; μ = 1.25×10^-4).

This data was used to generate Figure 6—figure supplement 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig6-figsupp2-data3-v2.zip
Figure 6—figure supplement 2—source data 4

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 500,000 simulated generations of the two-heat model with a small population size (N = 10^3) and strong hotspots (k1=5), when the population-scaled mutation rate at PRDM9 binding sites was adjusted to match that of a large population (with N = 10^6; μ = 1.25×10^-4).

This data was used to generate Figure 6—figure supplement 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig6-figsupp2-data4-v2.zip
Figure 6—figure supplement 2—source data 5

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 55,000 simulated generations of the two-heat model with a small population size (N = 10^3) and weak hotspots (k1=50), when the population-scaled mutation rate at both PRDM9 and PRDM9 binding sites was adjusted to match that of a large population (with N = 10^6; ν = 10^-2; μ = 1.25×10^-4).

This data was used to generate Figure 6—figure supplement 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig6-figsupp2-data5-v2.zip
Figure 6—figure supplement 2—source data 6

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 55,000 simulated generations of the two-heat model with a small population size (N = 10^3) and strong hotspots (k1=5), when the population-scaled mutation rate at both PRDM9 and PRDM9 binding sites was adjusted to match that of a large population (with N = 10^6; ν = 10^-2; μ = 1.25×10^-4).

This data was used to generate Figure 6—figure supplement 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-fig6-figsupp2-data6-v2.zip
Appendix 4—figure 1
Dynamics of the two-heat model when PT=1000.

The mean number of hotspots (n1), diversity at PRDM9 (π), and mean fitness (W), as a function of time. We show four cases: with weak and strong hotspots (top and bottom rows respectively) and large and small population size (left and right column respectively). See Figure 4 and Appendix 4 for details. This figure was generated using Appendix 4—figure 1—source data 1–4.

Appendix 4—figure 1—source data 1

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 2,000 simulated generations of the two-heat model with a large population size (N = 10^6) and weak hotspots (k1 = 50), and a smaller number of expressed PRDM9 molecules (PT=1000).

This data was used to generate Appendix 4—figure 1.

https://cdn.elifesciences.org/articles/83769/elife-83769-app4-fig1-data1-v2.zip
Appendix 4—figure 1—source data 2

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 2,000,000 simulated generations of the two-heat model with a small population size (N = 10^3) and weak hotspots (k1 = 50), and a smaller number of expressed PRDM9 molecules (PT=1000).

This data was used to generate Appendix 4—figure 1.

https://cdn.elifesciences.org/articles/83769/elife-83769-app4-fig1-data2-v2.zip
Appendix 4—figure 1—source data 3

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 500 simulated generations of the two-heat model with a large population size (N = 10^6) and strong hotspots (k1 = 5), and a smaller number of expressed PRDM9 molecules (PT=1000).

This data was used to generate Appendix 4—figure 1.

https://cdn.elifesciences.org/articles/83769/elife-83769-app4-fig1-data3-v2.zip
Appendix 4—figure 1—source data 4

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 500,000 simulated generations of the two-heat model with a small population size (N = 10^3) and strong hotspots (k1 = 5), and a smaller number of expressed PRDM9 molecules (PT=1000).

This data was used to generate Appendix 4—figure 1.

https://cdn.elifesciences.org/articles/83769/elife-83769-app4-fig1-data4-v2.zip
Appendix 4—figure 2
The dependence of key quantities on PRDM9 expression level.

(A) The number of hotspots that optimizes fitness for an individual homozygous for PRDM9 alleles as a function of the dissociation constant at hotspots (k1) for different levels of PRDM9 expression (PT). Horizontal dashed lines indicate the expected optimal values as k1 approaches zero (PT/4). (B) Mean number of hotspots among incoming and exiting alleles in small or large population sizes (N=10^3 or 10^6) and when hotspots are relatively strong or weak (k1 = 5 or 50), as a function of PRDM9 expression level (PT). (C) Average turnover time of segregating PRDM9 alleles for different parameters as a function of PRDM9 expression level (PT). (D) Mean diversity at PRDM9 over time for different parameters as a function of PRDM9 expression level (PT). The weighting of PRDM9 alleles in B and C is as detailed in the main text. This figure was generated using Appendix 4—figure 2—source data 1.

Appendix 4—figure 2—source data 1

Summary statistics from simulations for small and large population sizes (N=10^3 or 10^6) and for small and large hotspot dissociation constants (k1=5 or 50), for a range of values of the expression level of PRDM9 (PT=500, 1000, 2500 or 5000), including mean turnover times, mean initial numbers of hotspots, and mean number of hotspots for exiting PRDM9 alleles (each weighted by the sojourn time of PRDM9 alleles), as well as the mean and interquartile range of diversity at PRDM9 (π). This data was used to generate Appendix 4—figure 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-app4-fig2-data1-v2.zip
Appendix 5—figure 1
Dynamics of the two-heat model when the distribution of the initial numbers arises from a motif with 10 non-degenerate sites.

The mean number of hotspots (n1), diversity at PRDM9 (π), and mean fitness (W), as a function of time. We show four cases: with weak and strong hotspots (top and bottom rows respectively) and large and small population size (left and right column respectively). See Figure 4 and Appendix 5 for details. This figure was generated using Appendix 5—figure 1—source data 1–4.

Appendix 5—figure 1—source data 1

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 4,000 simulated generations of the two-heat model with a large population size (N = 10^6) and weak hotspots (k1 = 50), when the distribution of the initial numbers arises from a motif with 10 non-degenerate sites.

This data was used to generate Appendix 5—figure 1.

https://cdn.elifesciences.org/articles/83769/elife-83769-app5-fig1-data1-v2.zip
Appendix 5—figure 1—source data 2

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 2,000,000 simulated generations of the two-heat model with a small population size (N = 10^3) and weak hotspots (k1 = 50), when the distribution of the initial numbers arises from a motif with 10 non-degenerate sites.

This data was used to generate Appendix 5—figure 1.

https://cdn.elifesciences.org/articles/83769/elife-83769-app5-fig1-data2-v2.zip
Appendix 5—figure 1—source data 3

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 2,000 simulated generations of the two-heat model with a large population size (N = 10^6) and strong hotspots (k1 = 5), when the distribution of the initial numbers arises from a motif with 10 non-degenerate sites.

This data was used to generate Appendix 5—figure 1.

https://cdn.elifesciences.org/articles/83769/elife-83769-app5-fig1-data3-v2.zip
Appendix 5—figure 1—source data 4

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 1,000,000 simulated generations of the two-heat model with a small population size (N = 10^3) and strong hotspots (k1 = 5), when the distribution of the initial numbers arises from a motif with 10 non-degenerate sites.

This data was used to generate Appendix 5—figure 1.

https://cdn.elifesciences.org/articles/83769/elife-83769-app5-fig1-data4-v2.zip
Appendix 5—figure 2
Dynamics of the two-heat model when the distribution of the initial numbers arises from a motif with 11 non-degenerate sites.

The mean number of hotspots (n1), diversity at PRDM9 (π), and mean fitness (W), as a function of time. We show four cases: with weak and strong hotspots (top and bottom rows respectively) and large and small population size (left and right column respectively). See Figure 4 and Appendix 5 for details. This figure was generated using Appendix 5—figure 2—source data 1–4.

Appendix 5—figure 2—source data 1

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 4,000 simulated generations of the two-heat model with a large population size (N = 10^6) and weak hotspots (k1 = 50), when the distribution of the initial numbers arises from a motif with 11 non-degenerate sites.

This data was used to generate Appendix 5—figure 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-app5-fig2-data1-v2.zip
Appendix 5—figure 2—source data 2

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 2,000,000 simulated generations of the two-heat model with a small population size (N = 10^3) and weak hotspots (k1 = 50), when the distribution of the initial numbers arises from a motif with 11 non-degenerate sites.

This data was used to generate Appendix 5—figure 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-app5-fig2-data2-v2.zip
Appendix 5—figure 2—source data 3

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 500 simulated generations of the two-heat model with a large population size (N = 10^6) and strong hotspots (k1 = 5), when the distribution of the initial numbers arises from a motif with 11 non-degenerate sites.

This data was used to generate Appendix 5—figure 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-app5-fig2-data3-v2.zip
Appendix 5—figure 2—source data 4

The number of hotspots (n1), average population fitness (W), and diversity at the PRDM9 locus (π) per generation, across 1,000,000 simulated generations of the two-heat model with a small population size (N = 10^3) and strong hotspots (k1 = 5), when the distribution of the initial numbers arises from a motif with 11 non-degenerate sites.

This data was used to generate Appendix 5—figure 2.

https://cdn.elifesciences.org/articles/83769/elife-83769-app5-fig2-data4-v2.zip

Tables

Table 1
Parameters and variables of the model.
Parameter | Description | Value
N | Diploid population size | 10^3–10^6
μ | Mutation rate from hot to cold alleles at PRDM9 binding sites per generation | 1.25×10^-7
ν | Mutation rate at the PRDM9 locus per generation | 10^-5
D | Number of DSBs initiated per meiosis | 300
PT | Total number of PRDM9 molecules expressed per meiosis | 5,000
ki | Dissociation constant of PRDM9 binding sites with affinity i | 5–50
B | Probability that a conversion tract spans the PRDM9 binding motif | 0.7
r | Proportion of the genome corresponding to the smallest chromosome | 1/40

Variable | Description
nij | Number of PRDM9 binding sites with affinity i, recognized by PRDM9 allele j, across the genome
Hij,k | Probability that a given site with binding affinity i, recognized by PRDM9 allele j, is bound by PRDM9 in an individual with PRDM9 alleles j and k (Equation 1)
PBj,k | Total number of bound cognate PRDM9 molecules in an individual with PRDM9 alleles j and k
PFj,k | Number of free cognate PRDM9 molecules in an individual with PRDM9 alleles j and k per meiosis (Equation 2)
cj,k | Probability that any PRDM9-bound site experiences a DSB in an individual with PRDM9 alleles j and k (Equation 3)
αij,k | Probability of a PRDM9 binding site with affinity i, recognized by PRDM9 allele j, being symmetrically bound by PRDM9 and experiencing a DSB in an individual with PRDM9 alleles j and k (Equation 4)
Wj,k | The fitness of an individual with PRDM9 alleles j and k (Equation 5 for homozygotes, Equation 6 for heterozygotes)
Wj | The marginal fitness of PRDM9 allele j across possible genotypes (Equation 7)
fj | The allele frequency of PRDM9 allele j (Equation 8)
gij,k | Probability or rate of a PRDM9 binding site with affinity i, recognized by PRDM9 allele j, experiencing gene conversion spanning the PRDM9 binding motif, in an individual with PRDM9 alleles j and k (Equation 9)
gij | Probability or rate of a PRDM9 binding site with affinity i, recognized by PRDM9 allele j, experiencing gene conversion spanning the PRDM9 binding motif, across the population (Equation 10)
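
To illustrate how the genotype fitnesses Wj,k, the marginal fitnesses Wj and the allele frequencies fj relate to one another, the sketch below uses the standard deterministic form for selection under random mating: the marginal fitness of an allele is the frequency-weighted average over its possible partners, and each frequency is rescaled by marginal fitness over mean fitness. This form is an assumption made for illustration and is not quoted from Equations 7 and 8, which may also account for mutation and drift. The function names and the fitness values are hypothetical.

```python
def marginal_fitness(freqs, w):
    """W_j = sum over k of f_k * W[j][k], assuming random mating."""
    n = len(freqs)
    return [sum(freqs[k] * w[j][k] for k in range(n)) for j in range(n)]


def next_frequencies(freqs, w):
    """One generation of selection: f_j' = f_j * W_j / (mean fitness)."""
    wj = marginal_fitness(freqs, w)
    mean_w = sum(f * x for f, x in zip(freqs, wj))
    return [f * x / mean_w for f, x in zip(freqs, wj)]


# Hypothetical genotype fitnesses for two PRDM9 alleles (not from the paper).
w = [[0.90, 0.95],
     [0.95, 0.80]]
freqs = [0.5, 0.5]

print(marginal_fitness(freqs, w))
print(next_frequencies(freqs, w))
```

In this toy case the first allele has the higher marginal fitness, so its frequency rises above 0.5 after one generation while the two frequencies still sum to one.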

Cite this article

  1. Zachary Baker
  2. Molly Przeworski
  3. Guy Sella
(2023)
Down the Penrose stairs, or how selection for fewer recombination hotspots maintains their existence
eLife 12:e83769.
https://doi.org/10.7554/eLife.83769