A paternal bias in germline mutation is widespread in amniotes and can arise independently of cell division numbers

Abstract

In humans and other mammals, germline mutations are more likely to arise in fathers than in mothers. Although this sex bias has long been attributed to DNA replication errors in spermatogenesis, recent evidence from humans points to the importance of mutagenic processes that do not depend on cell division, calling into question our understanding of this basic phenomenon. Here, we infer the ratio of paternal-to-maternal mutations, α, in 42 species of amniotes, from putatively neutral substitution rates of sex chromosomes and autosomes. Despite marked differences in gametogenesis, physiologies and environments across species, fathers consistently contribute more mutations than mothers in all the species examined, including mammals, birds, and reptiles. In mammals, α is as high as 4 and correlates with generation times; in birds and snakes, α appears more stable around 2. These observations are consistent with a simple model, in which mutations accrue at equal rates in both sexes during early development and at a higher rate in the male germline after sexual differentiation, with a conserved paternal-to-maternal ratio across species. Thus, α may reflect the relative contributions of two or more developmental phases to total germline mutations, and is expected to depend on generation time even if mutations do not track cell divisions.

Editor's evaluation

This paper challenges a fundamental view concerning why males of most animals have a higher germline mutation rate than females. Evidence is provided to show that it is not simply the fact that males have more cell divisions in the germline, but instead, most of the mutations arise from a different balance of DNA damage vs. DNA repair. The case is supported by data from multiple species, from de novo mutation rate estimates from pedigrees, and from fits to a simple heuristic model. This work will be of interest to the broad field of DNA mutations and DNA repair, as well as evolutionary and phylogenomics researchers.

https://doi.org/10.7554/eLife.80008.sa0

Introduction

Humans tend to inherit more de novo mutations (DNMs) from their fathers than from their mothers. This phenomenon was first noted over 70 years ago, when JBS Haldane relied on the population frequency of hemophilia in order to infer that the DNM rate at the disease locus is substantially higher in fathers (Haldane, 1946). Work since then, particularly in molecular evolution, has confirmed a ‘male bias’ in mutation (henceforth paternal bias) (Makova and Li, 2002; Wolfe and Li, 2003; Li et al., 1996; Presgraves and Yi, 2009; Nachman and Crowell, 2000; Huang et al., 1997; Shimmin et al., 1993b; Chang et al., 1994), with estimates from human pedigrees indicating that, genome-wide, DNMs occur roughly four times more often on the paternal genome than on the maternal one (Kong et al., 2012; Francioli et al., 2015).

The textbook explanation for the paternal mutation bias is that it arises as a consequence of the vastly different numbers of cell divisions – and hence DNA replication cycles – necessary to produce sperm compared to oocytes (Crow, 2000; Drost and Lee, 1995; Penrose, 1955; Strachan and Read, 2018). In humans as in other mammals, oocytes are arrested in meiotic prophase I at birth, with no subsequent DNA replication in the mother’s life, whereas spermatogonia start dividing shortly before puberty and divide continuously throughout the reproductive life of the father (Drost and Lee, 1995; Guo et al., 2020). The observation that the number of DNMs increases with paternal age has been widely interpreted in this light, as evidence for DNA replication errors being the predominant source of germline mutation (Kong et al., 2012; Francioli et al., 2015; Goldmann et al., 2019; Jónsson et al., 2017).

A number of recent findings have called this view into question, however. First, analyses of large numbers of human pedigrees revealed an effect of maternal age on the number of maternal DNMs (Wong et al., 2016; Goldmann et al., 2016), with an additional ∼0.4 mutations accrued per year. Given the lack of mitotic cell division in oocytes after birth, this observation indicates that by typical reproductive ages, at least half of maternal DNMs arise from DNA damage (Jónsson et al., 2017). Second, despite highly variable rates of germ cell division over human ontogenesis, germline mutations appear to accumulate with absolute time in both sexes, resulting in a ratio of paternal-to-maternal germline mutation, α, of around 3.5 at puberty and very little increase with parental ages (Gao et al., 2019). Third, studies in a dozen other mammals suggest that α ranges from 2 to 4 whether the species reproduces months, years, or decades after birth (Wu et al., 2020; Wilson Sayres et al., 2011; Wang et al., 2022a), when estimates of germ cell division numbers at time of reproduction would predict a much wider range in α (Drost and Lee, 1995; Lindsay et al., 2019; Harland et al., 2017; Wu et al., 2020).

Explaining the observations in humans under a model in which most mutations are due to replication errors, and thus track cell divisions, would call for an exquisite balance of cell division and mutation rates across developmental stages in both sexes (Gao et al., 2016). In males, the constant accumulation of mutations with absolute time would require varying rates of germ cell divisions over ontogenesis to be precisely countered by reciprocal differences in the per cell division mutation rates. In females, it would necessitate that the mutation rate per unit of time be identical whether mutations arise from replication errors or damage. In turn, the similarity of α across mammals that differ drastically in their reproductive ages would entail two distinct sources of mutation – replication error in males and damage in females – covarying in tight concert with generation times.

A more parsimonious alternative is that most germline mutations arise from the interplay between damage and repair rather than from replication errors (Seplyarskiy et al., 2021), and that the balance results in more mutations on the paternal than the maternal genome (Gao et al., 2016). Assuming repair is inefficient relative to the length of the cell cycle or, perhaps more plausibly, that repair is efficient but inaccurate (Vilenchik and Knudson, 2003; Abascal et al., 2021), mutations that arise from damage will not track cell divisions (Gao et al., 2016). Damage-induced mutations must underlie the observed maternal age effect on DNMs in humans; they could also account for the accumulation of germline mutations in proportion to absolute time in males, assuming fixed rates of damage and repair machinery errors in germ cells.

Multiple lines of evidence have emerged in support of damage-induced mutations being predominant in the human germline. Analyses of the mutation spectrum in humans indicate that 75% of DNMs and 80% of mutations in adult seminiferous tubules are due to mutation ‘signatures’ SBS5/40 (Rahbari et al., 2016; Moore et al., 2021), which are clock-like, uncorrelated with cell division rates in the soma (Alexandrov et al., 2015; Alexandrov et al., 2020), and also predominant in post-mitotic cell types such as neurons (Lodato et al., 2018; Abascal et al., 2021). In addition, most substitutions in post-pubertal germ cell tumors are attributed to SBS5/40, in both females and males (Oliver et al., 2022). More generally, cell division rates do not appear to be a major determinant of mutation rates across somatic tissues (Blokzijl et al., 2016): notably, post-mitotic neurons accumulate mutations at a similar rate as mitotic somatic cell types that are the product of ongoing cell divisions (Abascal et al., 2021). A decoupling between cell division numbers and mutation burden has also been described in colonic crypts across mammals (Cagan et al., 2022), and in yeast, up to 90% of mutations have been estimated to be non-replicative in origin (Zhou et al., 2021). Altogether, these results suggest an important role, for both germline and soma, of mutagenic processes that accumulate with absolute time, as expected from damage-induced mutations (Gao et al., 2016).

In undermining the prevailing understanding of the paternal bias in human germline mutations, these observations revive the question of how the bias arises, as well as of the influences of life history traits and exogenous or endogenous environments. To investigate them, we took a broad taxonomic view, characterizing the paternal mutation bias across amniotes, including mammals but also birds and snakes, which differ in potentially salient dimensions. As two examples, in birds as in mammals, oogenesis is arrested by birth in females, while spermatogenesis is ongoing throughout male reproductive life (Guraya, 1989; Deviche et al., 2011), but birds have internal testes whereas mammals usually have external testes. In addition, mammals and birds are endotherms, in contrast to ectothermic reptiles such as snakes. More generally, the taxa considered vary widely in their life histories, physiologies, and natural habitats.

Results

Estimating sex differences in germline mutation rates across amniotes

To estimate α in each lineage, we relied on the evolutionary rates at putatively neutrally evolving sites of sex chromosomes compared to the autosomes (Miyata et al., 1987). The more direct approach of detecting DNMs in pedigrees requires pedigrees to be available for each species, and in large numbers for the estimates to be precise. In contrast, the evolutionary method is in principle applicable to any set of species with high-quality genome assemblies and a stable sex karyotype. It takes advantage of the fact that at the population level, sex chromosomes spend different numbers of generations in each sex (e.g., the X chromosome spends twice as many generations in females as in males), whereas autosomes spend an equal number in both (Figure 1A). Thus, all else being equal, if there is a paternal mutation bias, an autosome, with its greater exposure to the more mutagenic male germline, will accumulate more neutral substitutions than the X over evolutionary timescales (Figure 1A); the inverse will be true for the autosomes compared to the Z chromosome (Miyata et al., 1987).
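As a minimal illustration of this logic, the sketch below converts a sex-chromosome-to-autosome substitution rate ratio into α using the classic relationship of Miyata et al. (1987). It assumes equal generation times in the two sexes and no other differences between chromosomes, so it is a simplification rather than the estimation procedure used here; the function names are hypothetical.

```python
def alpha_from_x_to_autosome(ratio: float) -> float:
    """Invert R = 2(2 + alpha) / (3(1 + alpha)) for an XY system.

    The X spends 2/3 of its generations in females and 1/3 in males,
    whereas autosomes spend 1/2 in each sex.
    """
    if not 2 / 3 < ratio <= 4 / 3:
        raise ValueError("X/A ratio must lie in (2/3, 4/3] for a finite alpha >= 0")
    return (4 - 3 * ratio) / (3 * ratio - 2)


def alpha_from_z_to_autosome(ratio: float) -> float:
    """Invert R = 2(2 * alpha + 1) / (3(1 + alpha)) for a ZW system.

    The Z spends 2/3 of its generations in males and 1/3 in females.
    """
    if not 2 / 3 <= ratio < 4 / 3:
        raise ValueError("Z/A ratio must lie in [2/3, 4/3) for a finite alpha >= 0")
    return (3 * ratio - 2) / (4 - 3 * ratio)


# A ratio of 1 means no sex bias; lower X/A (or higher Z/A) means a paternal bias.
for r in (1.0, 0.9, 0.8):
    print(f"X/A = {r:.2f} -> alpha = {alpha_from_x_to_autosome(r):.2f}")
```

Under these assumptions, an X-to-autosome ratio of 0.8 corresponds to α = 4, and the ratio can never fall below 2/3 however strong the paternal bias.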

Figure 1

Estimating the paternal bias in mutation from neutral substitution rates of sex chromosomes and autosomes.

(A) On average, the lineage of an X chromosome spends fewer generations in males than females. Given a higher mutation rate in males than in females and all else being equal, this leads to lower rates of neutral substitutions on the X chromosome compared to autosomes (Miyata et al., 1987). (B) Procedure for estimating the ratio of paternal-to-maternal mutation rates, α, from substitution rates in sex chromosomes and autosomes. The autosomes and the X chromosome are partitioned into 1 Mb windows, depicted in purple and orange, respectively. Each window is filtered to focus on putatively neutrally evolving sequences (see Selecting non-repetitive and putatively neutral sequences in Materials and methods), and its GC content is calculated (represented by shading). The putatively neutral substitution rates per window are then regressed against the GC content (center panel, see Estimating α from X-to-autosome substitution rate ratios in Materials and methods). Substitution rate estimates for the X chromosome and autosomes are obtained from the regression fit (red points). Finally, the ratio of the point estimates is converted to an estimate of α (right panel). An analogous procedure applies to comparisons of the Z chromosome and autosomes in a ZW sex determination system.

Such evolutionary approaches have been widely applied, but until recently they were limited in the number of loci or species (e.g., Shimmin et al., 1993b; Huang et al., 1997; Pecon Slattery and O’Brien, 1998; Ellegren and Fridolfsson, 1997; Carmichael et al., 2000; Nachman and Crowell, 2000) and did not take into account the influence of sex differences in generation times on the estimation of α (Wilson Sayres et al., 2011). An additional complication to consider is that X (Z) and autosomes differ not only in their exposures to male and female germlines but in a number of technical and biological features (notably, GC content) that may need to be controlled for (Shimmin et al., 1993a; Pink and Hurst, 2010; Agarwal and Przeworski, 2019). Moreover, analyses involving closely related species can be confounded by the effects of ancestral polymorphism: for example, lower ancestral diversity in the X chromosome relative to the autosomes reduces the X-to-autosome divergence ratio, leading to overestimation of α (Presgraves and Yi, 2009; Figure 1B). In birds, unresolved branches within the phylogeny present an additional difficulty in estimating substitution rates (Jarvis et al., 2014; Reddy et al., 2017).

Here, we designed a pipeline for estimating the paternal mutation bias systematically across a wide range of species, mindful of these issues. To this end, we employed existing whole genome alignments (Zoonomia Consortium, 2020; Feng et al., 2020) or produced our own (for snakes, see Sequence alignments in Materials and methods), focusing on assemblies with high quality and contiguity and, where possible, those based on a homogametic individual. To handle the confounding effects of ancestral polymorphism on divergence, we thinned species in the phylogeny to ensure a minimum level of divergence between them, relative to polymorphism levels (see Species selection criteria in Materials and methods). This stringent filtering procedure resulted in three whole genome alignments including 20 mammals, 17 birds, and 5 snake species, respectively (Supplementary file 2).

In order to estimate neutral substitution rates from the alignments and compare X (Z) and autosomes while minimizing confounding factors, we focused on non-repetitive, non-exonic regions that were orthologous across all species in an alignment and did not overlap with pseudo-autosomal regions (PARs) with orthologs on the Y (W) chromosome (see Selecting non-repetitive and putatively neutral sequences in Materials and methods; see Figure 2—figure supplement 1F for a more stringent masking of all conserved regions). To account for differences between X (Z) and autosomes in features other than their exposure to each sex, we regressed putatively neutral substitution rates in the 1 Mb genomic windows against GC content and GC content squared (Figure 1B). We took this approach because GC content is readily obtained from any genome sequence and is highly correlated with known modifiers of the mutation rate such as replication timing and the fraction of CpG dinucleotides (Koren et al., 2012; Agarwal and Przeworski, 2019). In principle, X chromosome inactivation could also influence relative substitution rates on X versus autosomes, but in the germline, it is short-lived: limited in mice and humans to early embryogenesis in females and brief meiotic and post-meiotic periods in males (Chuva de Sousa Lopes et al., 2008; Guo et al., 2015). We obtained substitution rate estimates for the X (Z) chromosome and autosomes from the regression fit. Finally, we inferred α for the terminal branches leading to the 42 amniote species from the ratio of the substitution rate estimates for the X (Z) versus the autosomes (Figure 2), taking into account sampling error as well as uncertainty in the ratio of paternal-to-maternal generation times (Amster and Sella, 2016) (see Estimating α from X-to-autosome substitution rate ratios in Materials and methods).
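The regression step can be sketched as follows. This is a toy version under stated assumptions, not the pipeline itself: it fits a separate quadratic in GC content for X and autosomal windows by ordinary least squares and compares the two fits at a shared reference GC value (`gc_ref`, a hypothetical choice). All data are synthetic, and the helper names are illustrative.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def rate_at_reference_gc(gc, rate, gc_ref):
    """Fit rate ~ c0 + c1*d + c2*d**2 with d = gc - gc_ref and return c0.

    Centering on gc_ref makes the intercept the fitted rate at gc_ref.
    """
    rows = [[1.0, g - gc_ref, (g - gc_ref) ** 2] for g in gc]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, rate)) for i in range(3)]
    return solve_linear(ata, aty)[0]


def toy_rate(g, offset):
    """Synthetic substitution rate as a quadratic in GC content (made up)."""
    return 0.010 + 0.020 * g - 0.010 * g * g + offset


# Synthetic 1 Mb windows: autosomes and X differ in GC range and in rate.
auto_gc = [0.30 + 0.005 * i for i in range(60)]
x_gc = [0.32 + 0.004 * i for i in range(40)]
auto_rate = [toy_rate(g, 0.0) for g in auto_gc]
x_rate = [toy_rate(g, -0.001) for g in x_gc]

gc_ref = 0.40  # hypothetical shared reference GC content
ratio = (rate_at_reference_gc(x_gc, x_rate, gc_ref)
         / rate_at_reference_gc(auto_gc, auto_rate, gc_ref))
print(f"GC-adjusted X-to-autosome ratio: {ratio:.4f}")
```

Comparing fitted values at the same GC content, rather than raw chromosome-wide means, keeps a difference in base composition between X and autosomes from being mistaken for a difference in exposure to each sex.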

Figure 2

Estimates of the paternal bias in mutation across 42 amniote lineages.

Colored points denote estimates of α from X (Z)-to-autosome substitution rate ratios ( ${\hat{α}}_{evo}$ ) in mammals (top, orange), birds (middle, blue), and snakes (bottom, green). Vertical colored lines denote the mean ${\hat{α}}_{evo}$ for each group, while the vertical gray dashed line denotes $α = 1$ (i.e., no sex bias in mutation). Species in each group are plotted by their phylogenetic relationships and branch lengths are scaled by the neutral substitution rate estimated from autosomes (see Estimating putatively neutral substitution rates in Materials and methods). Note that branch lengths are comparable within the phylogeny of each taxon but not across taxa, as the scaling differs (see the legend for each group). In mammals, ${\hat{α}}_{evo}$ was estimated from neutral substitutions along the lineage from the tip to the most recent common ancestor indicated in the phylogeny. In birds, where phylogenetic relationships are more tenuous, we divided species into six subgroups (Supplementary file 5) to avoid highly uncertain ancestral nodes in Neoaves; thus, some ${\hat{α}}_{evo}$ estimates in Neoaves average over deeper splits than suggested by the full phylogeny, which we plot for clarity. Asterisks indicate species with chromosome-level assemblies. Darker colored horizontal lines behind the points represent 95% CIs, which were computed by bootstrap resampling of the 1 Mb genomic windows across 500 replicates; the central 95% interval across bootstrap replicates is shown. Lighter colored horizontal lines include uncertainty in the ratio of paternal-to-maternal generation times, allowing the ratio to range between 0.9 and 1.1 (Amster and Sella, 2016). Short vertical red lines denote point estimates of ${\hat{α}}_{dnm}$ from published pedigree mutation studies of de novo mutations, and the surrounding horizontal gray boxes represent the 95% binomial CI for those estimates.
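The bootstrap described in this legend can be sketched as below. The replicate count (500) and the central 95% interval follow the legend; everything else is an assumption made for illustration: the statistic is a simple ratio of mean window rates rather than the regression-based estimate, X and autosomal windows are resampled separately, and the window rates are synthetic.

```python
import random
from statistics import fmean


def bootstrap_ratio_ci(x_rates, a_rates, n_replicates=500, level=0.95, seed=0):
    """Percentile bootstrap CI for the ratio of mean X to mean autosomal rate.

    Windows are resampled with replacement within each chromosome class.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_replicates):
        x_sample = rng.choices(x_rates, k=len(x_rates))
        a_sample = rng.choices(a_rates, k=len(a_rates))
        ratios.append(fmean(x_sample) / fmean(a_sample))
    ratios.sort()
    k = int((1 - level) / 2 * n_replicates)  # replicates trimmed from each tail
    return ratios[k], ratios[-1 - k]


# Synthetic per-window substitution rates (illustrative values only).
rng = random.Random(1)
a_rates = [rng.gauss(0.0200, 0.0020) for _ in range(200)]
x_rates = [rng.gauss(0.0175, 0.0020) for _ in range(40)]

lo, hi = bootstrap_ratio_ci(x_rates, a_rates)
print(f"X/A ratio 95% CI: {lo:.3f} to {hi:.3f}")
```

Each bound of the ratio interval could then be converted to α to obtain an interval for the paternal bias itself.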

Overall, our evolutionary-based estimates, ${\hat{α}}_{evo}$ , are consistent with estimates from pedigree sequencing studies, ${\hat{α}}_{dnm}$ (Figure 2). Notably, and reassuringly, the point estimates for species with the largest amount of available DNM data (e.g., humans, mice, and cattle) are in very close agreement. Even in the absence of estimation error, this concordance is not necessarily expected, as ${\hat{α}}_{evo}$ is an average over many thousands of generations of evolution, whereas estimates from DNMs are based on small numbers of families at present. In principle, differences between the estimates could therefore arise if α evolves rapidly (as may have happened in the lineage leading to macaque), or if the ages of the parents in the pedigree are quite unrepresentative of average paternal-to-maternal generation times in evolution (Figure 2; Amster and Sella, 2016). The general concordance between ${\hat{α}}_{evo}$ and ${\hat{α}}_{dnm}$ therefore suggests that the evolutionary approach is providing reliable estimates and the paternal bias in mutation is not rapidly evolving.

Nonetheless, it is unlikely that our regression model perfectly accounts for all the genomic features that differ between sex chromosomes and autosomes other than exposure to sex. Remaining disagreement between ${\hat{α}}_{evo}$ and ${\hat{α}}_{dnm}$ could therefore also arise from mutation rate modifiers that differentially affect sex chromosomes and autosomes. For example, in cats, the low ${\hat{α}}_{evo}$ compared to ${\hat{α}}_{dnm}$ (Wang et al., 2022b) could be due to unusual features of the X chromosome: the feline X chromosome is known to harbor a large recombination coldspot spanning over 50 Mb (Li et al., 2016), visible in its effects on GC substitution rates (Figure 2—figure supplement 2; Meunier and Duret, 2004), and these features may have influenced the rate of substitution of the X chromosome relative to the autosomes.

A paternal bias in mutation is widespread in amniotes

A paternal bias in mutation is seen across amniotes, with a range of 1–4 in the species considered (Figure 2). The ${\hat{α}}_{evo}$ estimates remain similar if we exclude hypermutable CpG sites (Figure 2—figure supplement 1B), or focus only on mutation types that are not subject to the effects of GC-biased gene conversion (gBGC) (Figure 2—figure supplement 1F and Figure 2—figure supplement 3). Although the absolute magnitude of ${\hat{α}}_{evo}$ exhibits some sensitivity to different choices of conservation filters (e.g., excluding all conserved regions, not just exons) and different substitution types, the ${\hat{α}}_{evo}$ estimates are robustly above 1 and their ranking across species remains similar across different filtering criteria (see Figure 2—figure supplement 1 for details). The robustness of ${\hat{α}}_{evo}$ across conditions and filters confirms that, while our pipeline may not account for all the differences between autosomes and X (Z) chromosomes unrelated to sex differences in mutation, the qualitative patterns are reliable. These results therefore establish that the paternal bias in mutation is not specific to long-lived humans or to mammals, but is instead ubiquitous across species that vary markedly in their gametogenesis, physiology, and life history.

The effects of gBGC track recombination rates and result in greater selection for GC in regions of higher recombination. Therefore, if α is similar for different types of DNMs, as has been found in humans (Jónsson et al., 2017; Gao et al., 2019), the greater population recombination rate of autosomes relative to the sex chromosomes should lead the X-to-autosome substitution rate ratio of gBGC-favored mutation types (T>C and T>G) to be somewhat lower than that of mutation types unaffected by gBGC (C>G and T>A). Consistent with expectation, ${\hat{α}}_{evo}$ estimates in mammals using only gBGC-favored mutation types are inflated relative to estimates from mutation types unaffected by gBGC (Figure 2—figure supplement 3). Also as expected, bird and snake species with ZW sex determination exhibit the opposite pattern (i.e., a deflated ratio of Z-to-autosome substitution rate leads to a decreased estimate of ${\hat{α}}_{evo}$; Figure 2—figure supplement 3). The behavior of the different mutation types therefore provides a further sanity check on our estimates. While the estimation of ${\hat{α}}_{evo}$ could be further partitioned into single mutation classes, such estimates are noisier and – given the lack of ground truth – harder to interpret; we therefore focused on α for all substitution types combined.

Within mammals, the mean value of ${\hat{α}}_{evo}$ is 2.7, with a range 1.0–4.1 and a coefficient of variation of 0.29. In birds, ${\hat{α}}_{evo}$ is lower on average but also seemingly more stable, ranging from 1.5 to 2.7 (mean = 1.8, coefficient of variation = 0.19). In the handful of snake species sampled, the mean is similar to that of birds and ${\hat{α}}_{evo}$ ranges from 1.3 to 2.2 (mean = 1.7, coefficient of variation = 0.23), in agreement with a previous evolutionary estimate for rattlesnake (α=2.0; Schield et al., 2019).

In mammals, variation in α has long been known to be associated with generation times, and has been consistently interpreted as resulting from greater numbers of replication errors in species with longer-lived fathers (e.g., Wilson Sayres et al., 2011; Chang et al., 1994; Li et al., 1996; Li et al., 2002). We confirmed the observation here: after accounting for the phylogenetic relationship between species, mammals reproducing at older ages show a stronger paternal bias in mutation (p-value = 0.01, $r^{2} = 29\%$; Figure 3). Statistically significant relationships also exist between ${\hat{α}}_{evo}$ and other life history traits (Figure 3—figure supplement 1), but these traits are strongly correlated with one another (Figure 3—figure supplement 2) and generation time is the strongest single predictor (Figure 3—figure supplement 1; see Testing relationships between α and life history traits in Materials and methods). In contrast, a significant relationship between generation time and ${\hat{α}}_{evo}$ is not seen in birds (p-value = 0.30, $r^{2} = 7\%$; Figure 3; Wang et al., 2014), despite similar numbers of species and a comparable range of generation times to mammals. Moreover, we could reject the null model of a slope in birds equal to or greater than that of mammals (p-value = $10^{-5}$). (Given the paucity of generation time and α estimates for snakes, we could not test the relationship in reptiles.) Given recent evidence that most mutations depend on absolute time and not cell division rates, the standard explanation for this generation time effect no longer holds. These observations therefore raise the question of how else the relationship between generation times and α in mammals can be explained.

Figure 3

Relationship between ${\hat{α}}_{evo}$ and generation time estimates in mammals and birds.

Estimates of α from X (Z)-to-autosome comparisons are plotted against generation times from the literature (see Supplementary file 2), on a log scale. Lines denote the phylogenetic generalized least squares regression fits in mammals (orange) and birds (blue). λ refers to Pagel’s λ (Pagel, 1999), a measure of the strength of phylogenetic signal, which was inferred via maximum likelihood (see Testing relationships between α and life history traits in Materials and methods). Fixing λ to 1 in birds, as estimated for mammals, did not meaningfully improve the fit (p-value = 0.282, $r^{2} = 0.08$).

A cell-division-independent explanation for the correlation between α and generation time

In eutherian mammals, embryo development is likely independent of sex until primordial germ cell (PGC) specification and subsequent development of the gonads (Lin and Capel, 2015). As a result, mutations arising during early embryogenesis (Early) are expected to occur at a similar rate in males and females ($α_{Early} = 1$), as has been inferred in the few pedigree studies in which DNMs during parental early embryogenesis are distinguished from mutations later in development, namely in humans (Sasani et al., 2019), cattle (Harland et al., 2017), and mice (Lindsay et al., 2019; Figure 4A). While sex differences in early development may exist (Engel, 2018), differences in male and female mutation rates at such an early stage are likely modest in mammals (Spiller et al., 2017; Hancock et al., 2021). At some point after sexual differentiation of the germline, however (in what we term the Late stage), mutation rates in the two sexes need no longer be the same: sources and rates of DNA damage could differ between germ cells, as could the efficiency and accuracy of repair. Indeed, human fathers who recently reached puberty contribute over three times more mutations than similarly aged mothers (Gao et al., 2019). Intriguingly, the magnitude of paternal bias for mutations that occurred long after sexual differentiation of the PGCs appears to be similar in mice, cattle, and humans, at approximately 4:1 (Lindsay et al., 2019; Harland et al., 2017; Sasani et al., 2019; Figure 4A).

Figure 4

Variation in α among mammals may reflect varying exposures to different developmental stages.

(A) Ratio of paternal-to-maternal de novo mutations (DNMs) occurring in early embryogenesis (*Early*, white points), after the sexual differentiation of the germline (*Late*, gray points) and in both of these stages combined (*Total*, red line), for the three mammalian species in which this classification is available (mouse: Lindsay et al., 2019; cattle: Harland et al., 2017; human: Sasani et al., 2019). For each species, the percentage of DNMs occurring at each stage is indicated and used to scale the size of points. Vertical lines show the 95% binomial CIs. Since the phasing rate is not equal across developmental stages, point estimates for α in *Total* were computed by extrapolating the proportion of paternally and maternally phased DNMs in each stage to all the DNMs in that stage (i.e., assuming full phasing) (see Estimating α from pedigree studies in vertebrates in Materials and methods). (B) Schematic representation of a model in which α is the outcome of mutation in two developmental stages (see Modeling the effects of germline developmental stages on α in Materials and methods). (C) Expected relationship between α and generation time under the model outlined in B, assuming generation times are the same in both sexes. The increase of α with generation time depends on the paternal mutation rate per year in *Late*, $μ_{m}$, as illustrated by the purple gradient. (D) Fits of predicted α values to ${\hat{α}}_{evo}$ (orange) and ${\hat{α}}_{dnm}$ (gray). In each species, α is predicted with Equation 1 assuming $M_{e} = 1.66 \times 10^{-9}$ and using $μ_{f}$ and $μ_{m}$, the latter estimated from autosomal branch-specific substitution rates per year (${\hat{α}}_{evo}$) or as estimated from pedigree sequencing data (${\hat{α}}_{dnm}$) (see Modeling the effects of germline developmental stages on α in Materials and methods). The orange and gray lines denote the regression fit using phylogenetic generalized least squares (PGLS). PGLS statistics are shown for the two models (see Figure 3 legend for details).

In light of these observations, we considered a simple model in which α in mammals is the outcome of two developmental stages with distinct ratios of paternal-to-maternal mutations. In the Early stage until germline sex differentiation, we assumed a paternal-to-maternal mutation ratio of 1 and an expected number of mutations ($M_{e}$) on par with what is observed in humans (i.e., 5 mutations per haploid genome; Sasani et al., 2019; Jonsson et al., 2021; Ju et al., 2017; Figure 4B). In the Late developmental stage after germline sex differentiation, which varies in length among species, we assumed mutations arise at a constant rate per year, $μ_{s}$ in sex $s$ ($s \in \{f, m\}$). If we assume the length of Early to be negligible relative to the generation time, $G_{s}$ in sex $s$, then the expectation of α can be written as:

$$α = \frac{M_{e} + μ_{m} G_{m}}{M_{e} + μ_{f} G_{f}} \tag{1}$$

If the ratio $μ_{m} / μ_{f}$ is 4 across species, as suggested by DNM data (Lindsay et al., 2019; Harland et al., 2017; Sasani et al., 2019; Figure 4A and Supplementary file 3), this model yields a relationship between α and generation time bounded below by 1 and with a plateau at 4, assuming the same generation times in the two sexes (Figure 4C); more generally, the height of the plateau depends on the ratio of paternal-to-maternal generation times (Figure 4—figure supplement 1). The rapidity with which α reaches this asymptote is determined by the magnitude of $μ_{m}$ (and $μ_{f}$ ) in the Late stage (Figure 4C). Most pertinent, a positive relationship between α and the sex-averaged generation time is expected as long as $μ_{m} G_{m} > μ_{f} G_{f}$ .
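To make the shape of this relationship concrete, the sketch below evaluates Equation 1 over a range of generation times. The Early mutation count (5 per haploid genome) and the Late-stage ratio of 4 come from the text; the maternal Late rate `MU_F` is an illustrative placeholder, not an estimate from this study, and the function name is hypothetical.

```python
def expected_alpha(m_early, mu_m, mu_f, g_m, g_f):
    """Equation 1: paternal-to-maternal ratio of mutations per generation."""
    return (m_early + mu_m * g_m) / (m_early + mu_f * g_f)


M_EARLY = 5.0    # Early mutations per haploid genome, as assumed in the text
MU_F = 0.4       # maternal Late mutations per year (illustrative placeholder)
MU_M = 4 * MU_F  # Late-stage paternal-to-maternal ratio of 4

# Equal generation times in both sexes: alpha climbs from 1 toward 4.
for g in (0.5, 5.0, 30.0):
    print(f"G = {g:>4} years -> alpha = {expected_alpha(M_EARLY, MU_M, MU_F, g, g):.2f}")
```

With these placeholder values, α rises from about 1.1 at a generation time of six months to about 3.1 at 30 years, approaching but never reaching 4. If paternal generation times were 10% longer than maternal ones, the plateau would instead sit at 4.4.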

Using this model, we then predicted α for the terminal branches in the mammalian tree. To estimate the number of mutations occurring in Late for each branch, we used the evolutionary rates in Figure 2A. Specifically, we calculated a sex-averaged substitution rate per generation by multiplying the autosomal yearly substitution rate in each branch ( $μ_{y}$ ) by a generation time estimate for its tip (Supplementary file 2). Given a fixed ratio of paternal-to-maternal mutation rates of 4 in the Late stage, the mutation rate for each sex can be calculated for any given ratio of paternal-to-maternal generation times:

$$μ_{f} = \frac{μ_{y} (G_{f} + G_{m}) - 2 M_{e}}{G_{f} + 4 G_{m}} \tag{2}$$
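One way to see where this expression comes from is sketched here, under two assumptions: that the sex-averaged generation time is $(G_{f} + G_{m})/2$, and that the autosomal rate per generation is the mean of the paternal and maternal contributions. Setting $μ_{m} = 4 μ_{f}$ and equating the two descriptions of the per-generation rate gives:

$$μ_{y} \frac{G_{f} + G_{m}}{2} = \frac{(M_{e} + μ_{m} G_{m}) + (M_{e} + μ_{f} G_{f})}{2} = \frac{2 M_{e} + μ_{f} (G_{f} + 4 G_{m})}{2}$$

Solving for $μ_{f}$ recovers the expression above, and $μ_{m}$ then follows as $4 μ_{f}$.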

From the parental mutation rates and assuming a fixed $M_{e}$, we obtained an estimate of α that we can use to predict ${\hat{α}}_{evo}$ using Equation 1 (see Modeling the effects of germline developmental stages on α in Materials and methods). This model explains a significant proportion of the variance in ${\hat{α}}_{evo}$ in mammals ($r^{2} = 37\%$; p-value = 0.005; Figure 4D). After taking into account sampling error in our ${\hat{α}}_{evo}$ estimates (see Modeling the effects of germline developmental stages on α in Materials and methods), it explains 42% of the variance in α across species. Moreover, the fit of the model remains good regardless of the precise number of Early mutations assumed (see Modeling the effects of germline developmental stages on α in Materials and methods). The two clear outliers are carnivores, for which ${\hat{α}}_{evo}$ may be an underestimate, given the higher estimate from DNMs in cats (Figure 2).
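The prediction step can be sketched end to end as follows: sex-specific Late rates are derived from a yearly autosomal substitution rate and then plugged into Equation 1. The value of `M_EARLY` is the one quoted in the Figure 4 legend (presumably per base pair); the yearly rate and generation times in the example are hypothetical inputs, and the `late_ratio` parameter is a generalization added here for illustration.

```python
M_EARLY = 1.66e-9  # Early mutations, value quoted in the Figure 4 legend


def late_rates(mu_y, g_f, g_m, m_early=M_EARLY, late_ratio=4.0):
    """Sex-specific Late mutation rates per year, returned as (mu_f, mu_m).

    mu_y is the sex-averaged autosomal substitution rate per year; with
    late_ratio = 4 the expression for mu_f matches the one in the text.
    """
    mu_f = (mu_y * (g_f + g_m) - 2 * m_early) / (g_f + late_ratio * g_m)
    if mu_f < 0:
        raise ValueError("per-generation rate is smaller than the Early contribution")
    return mu_f, late_ratio * mu_f


def predicted_alpha(mu_y, g_f, g_m, m_early=M_EARLY, late_ratio=4.0):
    """Predict alpha with Equation 1 from the derived Late rates."""
    mu_f, mu_m = late_rates(mu_y, g_f, g_m, m_early, late_ratio)
    return (m_early + mu_m * g_m) / (m_early + mu_f * g_f)


# Hypothetical branch: 5e-10 substitutions per site per year, 30-year generations.
print(f"predicted alpha = {predicted_alpha(5e-10, 30.0, 30.0):.2f}")
```

By construction, the mean of the paternal and maternal per-generation totals equals the yearly rate multiplied by the sex-averaged generation time, so the prediction depends only on how that total is split between Early and Late.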

These predictions rely on evolutionary estimates that are uncertain, due for instance to inaccuracies in split time estimates and the use of contemporary generation times as proxies for past ones. If we instead predict α using parameters derived from pedigree data in the nine mammalian species for which at least 30 DNMs have been phased and more than one trio has been studied (Modeling the effects of germline developmental stages on α in Materials and methods), the model explains 86% of the variance in ${\hat{α}}_{dnm}$ (p-value = $3 \times 10^{-4}$; Figure 4D). We caution that this assessment is based on few phylogenetically independent contrasts, however, and so while the fit of the model again appears quite good, the variance explained may be deceptively high.

In any case, this phenomenological model clarifies that the increased α seen in long-lived mammals may simply reflect a reduction in the fraction of early embryonic mutations relative to the total number of mutations per generation – consistent with the higher proportion of Early mutations in mice and cattle compared to humans (Figure 4A). This model can also explain the only modest increase in α with parental ages observed in humans (Gao et al., 2019).

Given this explanation for the effect of generation times on α in mammals, why is a relationship not seen in birds (Figure 2)? One interpretation is simply a lack of statistical power: since the ratio of paternal-to-maternal age effects in the Late stage is lower in birds than in mammals (around 2 instead of 4), under our model, bird generation times would influence α within a narrower range (i.e., between 1 and 2). Alternatively, the lack of a relationship between α and generation times in birds could reflect their distinct germ cell development: unlike in mammals, avian sexual phenotype is directly determined by the sex chromosome content of individual cells (Zhao et al., 2010; Ioannidis et al., 2021) and PGCs are determined by inheritance of maternally derived gene products (Extavour and Akam, 2003). Given these features, it seems plausible that sex differences in mutation rates appear earlier in ontogenesis in birds than in mammals, consistent with reported sex differences in the cellular phenotypes of PGCs prior to gonad development (Soler et al., 2021). If indeed the mutation rate in the two bird sexes differs from very early on in development (i.e., if the term $M_{e} \approx 0$ in Equation 1), then assuming a fixed ratio of paternal-to-maternal generation times, our model predicts that the sex-averaged age of reproduction will have little to no influence on α.

Discussion

Analyzing diverse species with the same pipeline, we found that, far from being a feature of species with long-lived males, a paternal bias in germline mutation is ubiquitous across amniotes that differ markedly in their life history, physiology, and gametogenesis. Moreover, by considering the different developmental stages over which germline mutations arise, we provide a new and simple explanation for variation in the degree of sex bias across mammals that does not require dependence on the number of cell divisions.

These findings do not explain why male germ cells accumulate more mutations than female ones, however. Given that the paternal bias varies little across species exposed to disparate physical environments, and presumably distinct exogenous mutagens, the proximate causes of the paternal bias are likely sex differences in endogenous sources of DNA damage or in repair mechanisms. For instance, the effects of reactive oxygen species, a major source of DNA damage, may be greater in male germ cells than in oocytes (Smith et al., 2013; Rodríguez-Nuevo et al., 2022). In turn, the evolutionary cause of the paternal bias could be related to the different evolutionary pressures acting on each sex of anisogamous species, for example due to greater competition among sperm than among oocytes.

Another question raised by our findings is why, after sexual differentiation of the germline, mutation appears to be more paternally biased in mammals (∼4:1) than in birds and snakes (∼2:1). In that regard, it will be of interest to collect pedigree data from these taxa, with which to compare mutation signatures to those typically seen in mammals.

Beyond these questions, our findings suggest a change of focus, reframing sex differences in germline mutation rate as part of a broader puzzle: why certain cell types accrue more mutations than others. In that regard, it is intriguing that the relative mutation rates of different cell types seem similar across mammals. The balance of damage and repair results in an approximately fourfold higher mutation rate in spermatogonia compared to oocytes across mammalian species (Figure 4A). Similarly, comparing yearly mutation rates in colonic crypts (Cagan et al., 2022) to estimates for spermatogonia, the ratio of crypt-to-sperm mutation rates appears relatively stable across four mammalian species (Figure 4—figure supplement 2). This observation suggests that, beyond spermatogonia and oocytes, the relative mutagenicity of different cell types may be conserved across mammals. Mutation rates in different cell types could be coupled over time either because of natural selection to maintain specific rates in each cell type or because changes to the repair machinery in some cell types (potentially, germ cells) have pleiotropic consequences on mutation rates in others. Regardless, our observations point to a role of natural selection in maintaining the relative rates at which mutations accumulate in different cell types over long evolutionary timescales.

Materials and methods

Sequence alignments

In mammals, we obtained sequence alignments from the 241-way multi-alignment generated by the Zoonomia Project (https://zoonomiaproject.org/) (Zoonomia Consortium, 2020). To assess the effect of reference sequence selection on our α estimates, we considered two alignments, one using the Homo sapiens genome as reference sequence and the other using the Mus musculus genome as reference (Figure 2—figure supplement 1A).

In birds, we subdivided the 363-way alignment generated by the B10K project (https://b10k.genomics.cn/) (Feng et al., 2020) into six subgroups, avoiding the inclusion of ancestral nodes with high uncertainty within Neoaves (Feng et al., 2020; Jarvis et al., 2014; Prum et al., 2015). Since a species topology is required to accurately infer branch-specific substitution rates, we built species sets by combining monophyletic groups that are well supported across data types and studies (Reddy et al., 2017; Supplementary file 4). In all cases, we used the Gallus gallus genome as the reference sequence.

In snakes, we built our own multiple genome alignments using whole genome assemblies downloaded from the National Center for Biotechnology Information (NCBI) database (Supplementary file 1). To speed up computation, we removed repetitive regions – which are ignored in all downstream analyses – from the whole genome FASTA files prior to alignment by converting lowercase bases (i.e., a, t, c, g) to N bases. We ran the Cactus program (v1.2.5, https://github.com/ComparativeGenomicsToolkit/cactus, Hickey, 2022a) to align the genomes in each clade using topologies generated by TimeTree as our guide trees (see trees/Snakes.TimeTree.nwk at https://github.com/flw88/mut_sex_bias_amniotes/; de Manuel, 2022). For subsequent analyses, we used Thamnophis elegans as the reference sequence in snakes.
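The masking step described above, in which soft-masked (lowercase) bases are converted to N before alignment, can be sketched as follows. This is a minimal illustration that assumes repeats are marked in lowercase in the FASTA files, as stated in the text; the function names are hypothetical and this is not the authors' script.

```python
def mask_soft_masked(sequence):
    """Replace lowercase (soft-masked, repetitive) bases with N."""
    return "".join("N" if base.islower() else base for base in sequence)


def mask_fasta_lines(lines):
    """Yield FASTA lines with sequence lines masked and headers untouched."""
    for line in lines:
        line = line.rstrip("\n")
        yield line if line.startswith(">") else mask_soft_masked(line)


# Hypothetical toy record, for illustration only.
example = [">chr1 example", "ACGTacgtNNac", "ttGGCC"]
print(list(mask_fasta_lines(example)))
```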

For each taxon, we converted the HAL file into a Multiple Alignment Format (MAF) file and split the alignment into non-overlapping windows of 1 Mb using the hal2maf tool in halTools (https://github.com/ComparativeGenomicsToolkit/hal/; Hickey, 2022b):

hal2maf $hal $maf --targetGenomes $species_list --refGenome \
    $reference --refSequence $reference_chrom --start $start \
    --length $((end - start)) --onlyOrthologs --noDupes --noAncestors
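The `$start` and `$end` values passed to the command above come from tiling each reference sequence into non-overlapping 1 Mb windows. A minimal sketch of how such window coordinates could be generated is given below; the function name is hypothetical, and 0-based start coordinates are an assumption of this sketch.

```python
def window_coordinates(chrom_length, window_size=1_000_000):
    """Return (start, length) pairs that tile a sequence without overlap.

    The final window is shorter when chrom_length is not a multiple
    of window_size.
    """
    return [
        (start, min(window_size, chrom_length - start))
        for start in range(0, chrom_length, window_size)
    ]


# Hypothetical 2.5 Mb sequence: two full windows and one partial window.
for start, length in window_coordinates(2_500_000):
    print(start, length)
```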

Species selection criteria

To estimate α, we aimed to measure differences in the rates of neutral substitution in X (Z) versus autosomes that are directly attributable to differences in the mutation rate of males and females. However, X (Z) and autosomes also differ in a number of other technical and biological features that must first be taken into account.

One important source of technical bias is the unequal sequence coverage of the X (Z) and autosomes in heterogametic individuals. To minimize any potential issues due to systematic differences in assembly quality between X (Z) and autosomes, we excluded non-chromosome-level genomes known to be assembled exclusively from DNA of the heterogametic sex. In addition, we discarded any species belonging to a genus in which a complex system of chromosomal sex determination has been identified (annotated as ‘complex XY’ or ‘complex ZW’ in the Tree of Sex database https://coleoguy.github.io/tos/data.vert.csv, with the exception of the Mus genus). In mammals, out of a total of 241 genomes, this approach led us to exclude 50 male-based assemblies and nine species with at least one case of complex XY in the same genus. In birds, out of a total of 363 genomes, we excluded 186 female-based assemblies and two species with at least one case of complex ZW in the same genus.

The quality of the genome assembly is an additional potential confounder. Given that we relied on higher quality, chromosome-level assemblies to categorize alignments as X (Z) or autosomal, we would be more likely to miscategorize alignments (i.e., as X/Z or autosomal) in species with lower quality genome assemblies that are highly diverged from the nearest chromosome-level assembly. To address this issue, in mammals, we removed species if their genomes were >15% diverged from the nearest chromosome-level assembly. We relaxed the divergence threshold to 30% in birds, in which fewer genomes are assembled at chromosome level and across which karyotypes are believed to be relatively stable (Ellegren, 2010). In both mammals and birds, we relied on published divergence estimates inferred from the same multi-alignments used in this study (Zoonomia Consortium, 2020 for mammals; Feng et al., 2020 for birds). We also discarded species with low-quality scaffold-level assemblies, that is, where scaffold N50 < 350 kb and contig N50 < 25 kb. These filters led to the removal of 120 and 76 species in mammals and birds, respectively.
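The divergence and assembly-quality filters above amount to a simple rule per species. The sketch below is one possible reading of that rule, not the authors' notebook: the function name and arguments are hypothetical, and it takes the N50 criterion literally (an assembly is discarded only when both N50 values fall below their thresholds) and applies it to scaffold-level assemblies only.

```python
def passes_assembly_filters(divergence, scaffold_n50, contig_n50,
                            chromosome_level=False,
                            max_divergence=0.15,
                            min_scaffold_n50=350_000,
                            min_contig_n50=25_000):
    """Return True if a species survives the divergence and N50 filters.

    divergence: fraction diverged from the nearest chromosome-level assembly
    scaffold_n50, contig_n50: assembly statistics in base pairs
    max_divergence: 0.15 for mammals; the text relaxes this to 0.30 for birds
    """
    too_diverged = divergence > max_divergence
    # As worded in the text: low quality when both N50 values are low.
    low_quality = (
        not chromosome_level
        and scaffold_n50 < min_scaffold_n50
        and contig_n50 < min_contig_n50
    )
    return not (too_diverged or low_quality)


# Hypothetical species: 20% diverged, passes only under the bird threshold.
print(passes_assembly_filters(0.20, 500_000, 30_000))
print(passes_assembly_filters(0.20, 500_000, 30_000, max_divergence=0.30))
```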

Given the paucity of genomes in snakes, we relaxed our filtering criteria to allow the inclusion of a larger number of species. Specifically, we allowed scaffold-level assemblies from the heterogametic sex and reduced the scaffold and contig N50 thresholds to 100 and 10 kb, respectively. These changes allowed the inclusion of Vipera berus and Pantherophis obsoletus. We estimated divergence between species using phyloFit (see Estimating putatively neutral substitution rates) in the largest chromosome in Thamnophis elegans (NC_045541.1). As in mammals, we removed any species whose distance to the nearest chromosome-level assembly was >15% and confirmed that none of the species belong to a genus with a complex ZW system in the Tree of Sex database. This procedure excluded one of the nine snake species (Laticauda laticaudata).

Another important consideration comes from the differing evolutionary histories of sex chromosomes and autosomes. Under neutrality and assuming equal variances in reproductive success, the X (Z) chromosome is expected to have a lower effective population size, $N_{e}$, than the autosomes (Amster and Sella, 2020). For closely related species, this implies a deeper coalescence time of autosomes than X (Z) in their ancestral population and therefore an unequal contribution of ancestral polymorphisms to the substitution rates; for example, if $N_{e}^{X} < N_{e}^{A}$, then the X-to-autosome substitution rate ratio will be deflated relative to the expectation under mutational male bias alone, and consequently α will be overestimated (Presgraves and Yi, 2009). To minimize this problem, we sought to keep a subset of species that were sufficiently distantly related such that the contribution of ancestral polymorphism to divergence is small and the bias in α estimates is negligible. Specifically, we proceeded as follows: under simplifying assumptions, the expected neutral divergence attributable to ancestral polymorphisms is given by the heterozygosity, π, in the ancestral species. Since π in the ancestral population of a species pair is unknown, we used estimates for π from present-day species as a proxy. We pruned the phylogeny of each taxon so as to retain only species pairs with a combined (summed) substitution rate of at least $15 π$, where π is the higher value of the pair.

We collected mammalian π estimates from the individual heterozygosities in the Zoonomia Project (‘Overall heterozygosity’ in Table S3 in Zoonomia Consortium, 2020), complemented with the nucleotide diversities in Buffalo, 2021a (‘log10_diversity’ in data/combined_data.tsv at https://github.com/vsbuffalo/paradox_variation/; Buffalo, 2021b) obtaining π values for 16 of the remaining mammalian species. For any species lacking a value in both databases, we assigned the π of the closest species in the mammalian phylogeny as inferred with PHAST (45 species). Finally, in one case in which π from both databases were available (Daubentonia madagascariensis), we took the average π.

In birds, we used π estimates in Brüniche-Olsen et al., 2021, obtaining direct estimates for 13 of the remaining species. For species not present in the database, we assigned the π of the closest species in the bird phylogeny (85 species). In snakes, we collected π values from the literature (Supplementary file 1).

Because initiatives like the Zoonomia Project or B10K may preferentially select species at risk of extinction (Zoonomia Consortium, 2020), some of the present-day π values may underestimate the diversity levels in the ancestor. We thus set an extra requirement of at least a combined 2% substitution rate between any pair of species. In species pairs where the rate was below either of these two thresholds ($15 π$ or 2%), we preferentially retained the species that met the following criteria, considered in this order: (1) more phased DNM count data from pedigree sequencing (count of 0 if not available), (2) a chromosome-level assembly, and (3) a higher scaffold N50. Altogether, 20 out of 241 mammalian species, 17 out of 363 bird species, and 5 out of 9 snake species remained after the complete filtering procedure.
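The two divergence thresholds used for pruning (a combined substitution rate of at least $15 π$, with π the higher value of the pair, and at least 2%) can be combined into a single check per species pair. The sketch below is illustrative only; the function name and the example values are hypothetical, and rates are expressed as substitutions per site.

```python
def pair_is_sufficiently_diverged(rate_a, rate_b, pi_a, pi_b,
                                  pi_multiple=15.0, min_combined_rate=0.02):
    """Check whether a species pair passes both divergence thresholds.

    rate_a, rate_b: substitution rates (per site) on the two lineages
    pi_a, pi_b: present-day heterozygosity, used as a proxy for ancestral pi
    """
    combined = rate_a + rate_b
    threshold = max(pi_multiple * max(pi_a, pi_b), min_combined_rate)
    return combined >= threshold


# Hypothetical pair: combined rate 0.04 against a threshold of 0.03.
print(pair_is_sufficiently_diverged(0.02, 0.02, 0.001, 0.002))
```

Pairs that fail the check are the ones for which one species is dropped, following the ordered retention criteria listed in the text.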

A list of the species kept after filtering, together with other genome statistics and results from our analyses, can be found in Supplementary file 2. The code to reproduce the filtering procedure described above can be found in notebooks/Filter_species.ipynb at https://github.com/flw88/mut_sex_bias_amniotes; de Manuel, 2022.

Selecting non-repetitive and putatively neutral sequences

In the absence of natural selection and/or gBGC, the substitution rate is equal to the mutation rate (Kimura, 1983). To minimize the effects of selection, we limited our analyses to non-coding regions by removing all exons annotated in the given reference sequence as well as the 1 kb of sequence flanking each exon. As a check, we also estimated α in mammals and birds after masking conserved elements identified by phastCons (Siepel et al., 2005). Since the resulting ${\hat{α}}_{evo}$ estimates are similar, we based our analyses on the larger dataset obtained by masking only exons and their 1 kb flanking sequences.

The effect of gBGC mutation on the substitution process is analogous to that of selection for specific base pairs, in that the process increases the probability of fixation of strong (G/C) over weak (A/T) alleles (Duret and Galtier, 2009). To explore the effects of gBGC, we estimated specific rates for each single-nucleotide substitution type (see Estimating putatively neutral substitution rates for details). To remove the effects of gBGC, we estimated α for the subset of mutation types that are not subject to gBGC (i.e., substitutions from strong to strong and weak to weak nucleotides) (Figure 2—figure supplement 3 and Figure 2—figure supplement 1G).

In addition, to ensure the high quality of the alignment data for analysis, we removed repetitive regions, keeping only those genomic positions at which the reference sequence in a given analysis group (mammals, birds, and snakes) carries an uppercase nucleotide.

Filtering idiosyncratic genomic regions

We excluded sequences aligned to known PARs in the sex chromosomes, which have homologs on both X and Y (or both Z and W) and thus behave like autosomes in terms of their ploidy (see Supplementary file 5 for PAR definitions). For snakes, we aligned sequenced reads from a female Thamnophis sirtalis individual (NCBI accession SAMN02402779) to the Thamnophis elegans reference genome using BWA-MEM v0.7.17-r1188 (http://bio-bwa.sourceforge.net/), with default parameters. We removed PCR duplicates with the markdup tool in samtools v1.10 (http://www.htslib.org/) and calculated the mean depth of coverage along the Z and the largest autosome in 1 Mb windows using mosdepth (https://github.com/brentp/mosdepth, Pedersen, 2022). We then determined regions of the Z chromosome in which the depth of coverage was significantly different from that in the autosomes, assuming depth is Poisson distributed with λ equal to the mean depth in the autosome, potentially indicative of the region being in a PAR and having homologs on the W chromosome (Figure 1—figure supplement 1).
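A per-window depth comparison of the kind described above can be sketched with a Poisson tail probability. The text specifies only the Poisson assumption with λ set to the mean autosomal depth; rounding each window's mean depth to an integer and doubling the smaller tail to obtain a two-sided p-value are simplifications of this sketch, and the function names are hypothetical.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), with terms computed in log space."""
    if k < 0:
        return 0.0
    total = sum(
        math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
        for i in range(k + 1)
    )
    return min(1.0, total)


def depth_p_value(window_depth, autosome_mean_depth):
    """Two-sided p-value for a window's mean depth (rounded to an integer)."""
    k = round(window_depth)
    lower_tail = poisson_cdf(k, autosome_mean_depth)
    upper_tail = 1.0 - poisson_cdf(k - 1, autosome_mean_depth)
    return min(1.0, 2.0 * min(lower_tail, upper_tail))


# Hypothetical depths: a window at half the autosomal depth and one at parity.
print(depth_p_value(15, 30.0) < 0.01)
print(depth_p_value(30, 30.0) > 0.05)
```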

The genomes of birds and snakes are organized into two types of autosomes, macro- and microchromosomes, which differ in their length, gene content, density of hypomethylated CpG islands, recombination rates, and replication timing (Waters et al., 2021). Given the idiosyncrasies of microchromosomes, which may affect the substitution rate estimates (Wang et al., 2014), we excluded sequences aligned to microchromosomes in birds and snakes (chromosomes 10–28 in Gallus gallus and chromosomes 13–18 in Crotalus viridis). The fraction of base pairs in microchromosomes is relatively small, comprising 20% and 5.1% of the autosomal genome in Gallus and Crotalus, respectively. We checked that the ${\hat{α}}_{evo}$ estimates are similar whether or not microchromosomes are excluded ($r > 0.9$ between ${\hat{α}}_{evo}$ estimates obtained after excluding or including microchromosomes, in both birds and snakes, Figure 2—figure supplement 1E).

An additional concern is that genomic translocations between X (Z) and the autosomes could lead to sequence misclassification in species without a chromosome-level assembly. To alleviate this potential issue, we only kept sequences that exclusively mapped to chromosomes of the same kind (i.e., X or Z versus autosome) in all species for which chromosome-level assemblies were available. In other words, we removed all alignments in which chromosome-level assemblies indicated a mapping between an X (Z) sequence of one species with an autosomal or Y (W) sequence of another.

To summarize, each 1 Mb MAF file in each taxon was first filtered with the maf_parse tool in PHAST (http://compgen.cshl.edu/phast/), using a thinned set of species obtained as described in Species selection criteria and a BED file with the regions to be excluded as indicated by the reference genome (i.e., exons ±1 kb and the PARs, if known). The python scripts filter_PARs_micros_CpGs.py and keep_species_XYA-synteny.py (available at https://github.com/flw88/mut_sex_bias_amniotes, copy archived at swh:1:rev:37da9bdbc2c7cb839de15aadb554cf6c98128add; de Manuel, 2022) were then used to filter any gaps, annotated PARs, as well as regions that mapped to known chromosomes of a different kind:

maf_parse --features $regions_to_exclude_bed -M $reference \
    --seqs $(cat $species_list_thinned) $maf |
python filter_PARs_micros_CpGs.py -p data/Species_to_PARs.tsv |
python filter_species_gaps_maf_XYA.py \
    -l $species_list_thinned -c data/Species_to_chromosomes.txt \
    -b $filtered_regions_bed -a > $filtered_maf

GC content and replication timing estimates

The framework provided by Miyata et al., 1987, to infer α assumes that the generation time is the same for both sexes, as well as that the substitution rates on autosomes versus X (Z) are solely determined by the sex-specific mutation rates and the ploidy difference between sexes. However, other genomic features, such as GC content and replication timing, are known to differentially influence the mutation rate of sex-linked and autosomal chromosomes (Agarwal and Przeworski, 2019; Koren et al., 2012). To account for these differences, we collected measures of species-specific GC content. Specifically, for every filtered 1 Mb MAF in each taxon, we calculated the fraction of G/C base pairs in each genome with:

cat $filtered_maf | \
python gc_content_from_maf.py -s $species_list_thinned
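For reference, the quantity computed for each genome in a window is simply the share of G and C among the aligned bases. The sketch below shows that calculation on a single aligned sequence; it is not the authors' `gc_content_from_maf.py`, and ignoring gaps and N bases in the denominator is an assumption of this sketch.

```python
def gc_fraction(sequence):
    """Fraction of G/C among unambiguous bases (gaps and N are ignored)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    # Return NaN when the sequence holds no unambiguous bases.
    return gc / (gc + at) if gc + at else float("nan")


# Hypothetical aligned row with gaps and masked bases.
print(gc_fraction("GC--NNat"))
```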

We additionally obtained replication timing data in human embryonic stem cells from the UCSC genome browser. We converted the data from bigWig format to BED using bigWigToBedGraph and lifted the coordinates from the hg19 reference genome to hg38 using the liftOver tool.

To explore the relationship between replication timing and substitution rates in humans, we calculated an average replication timing value across the unfiltered bases in each 1 Mb window of the mammalian alignment (H. sapiens as reference). Specifically, we used the mean replication timing value weighted by the number of bases associated with each replication timing datum.
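The base-weighted average described above can be written in a few lines. In this sketch, each replication timing datum overlapping a window is represented as a (value, number of unfiltered bases) pair; the function name and input layout are hypothetical.

```python
def weighted_mean_timing(intervals):
    """Mean replication timing of a window, weighted by base count.

    intervals: list of (timing_value, n_bases) pairs within one window
    Returns None when the window has no covered bases.
    """
    total_bases = sum(n_bases for _, n_bases in intervals)
    if total_bases == 0:
        return None
    return sum(value * n_bases for value, n_bases in intervals) / total_bases


# Hypothetical window covered by two timing data of unequal size.
print(weighted_mean_timing([(1.0, 100), (3.0, 300)]))
```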

Estimating putatively neutral substitution rates

To estimate putatively neutral substitution rates on X (Z) and autosomes, we used phyloFit, a program within the PHAST software suite (Hubisz et al., 2011; Siepel and Haussler, 2004) (http://compgen.cshl.edu/phast/). For every 1 Mb window of aligned sequence in each taxon with ≥10 kb of sequence remaining after filtering, we estimated substitution rates using the general, unrestricted single nucleotide model (--subst-mod UNREST) with the expectation maximization algorithm with medium precision for convergence (--EM --precision MED). We also obtained the number of expected counts at each node for each substitution type (option -Z). For mammals and birds, we used the relevant tree topology defined in the Newick files in http://cgl.gi.ucsc.edu/data/cactus/; for snakes, we used a topology from TimeTree (http://timetree.org/). To avoid local maxima in the likelihood surface, we ran six independent phyloFit runs with random initialization of the parameters (option -r) and kept the replicate with the highest likelihood. We note that phyloFit estimates the expected substitution counts for type A₁ > A₂ by inferring the expected number of times allele A₁ is found at the internal node of a branch in the tree and allele A₂ is observed at the terminal node. However, the overall branch lengths are maximum likelihood estimates of the expected rate of substitution in continuous time along the branches. Thus, the rate of substitution estimated by summing substitution counts and dividing by the genome size is slightly smaller than the maximum likelihood branch-length estimate (as the latter allows for back-mutations but the former does not include them).

phyloFit -r --EM --precision MED --subst-mod UNREST -Z \
    --msa-format MAF $filtered_maf --tree $newick \
    -e $phylofit_errors -o $phylofit_output

Estimating α from X-to-autosome substitution rate ratios

We took a regression approach to estimate α from ratios of X (Z)-to-autosome substitution rates. This approach allowed us to control for the effect of GC content $g$ on the substitution rates (see Estimating putatively neutral substitution rates). For each species, we performed a Poisson regression with a log link function on the number of substitutions $Y_{i}$ in the terminal branch (as inferred from phyloFit):

$$\log [E (Y_{i} | n_{i}, x_{i}, g_{i})] = \log (n_{i}) + β_{0} + β_{1} x_{i} + β_{2} g_{i} + β_{3} g_{i}^{2} \tag{3}$$

where the subscript denotes the ith window, $n$ denotes the number of bases at which a substitution could have occurred, $x$ is an indicator variable denoting whether the window is on the X (Z) or the autosomes, and the β variables denote the regression coefficients (Supplementary file 6). Modeling the relationship between substitution rate and GC content as a quadratic function captures effects of hypermutable CpG sites via the squared term (Hardison et al., 2003; Hellmann et al., 2005). Note that for the overall substitution count, the number of substitution opportunities $n$ is the total number of sites left in the window after filtering; however, when applying the regression model to a specific substitution type $A_{1} > A_{2}$ , we only considered sites where the ancestral allele was inferred by phyloFit to be A₁ (or its complementary base, see Estimating putatively neutral substitution rates).
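Because the chromosome indicator enters the regression additively on the log scale, the fitted X (Z)-to-autosome rate ratio at any fixed GC content reduces to a single coefficient. The short derivation below is a sketch that takes the regression model exactly as written above and assumes the indicator is coded as $x = 1$ for X (Z) windows and $x = 0$ for autosomal ones (the coding is an assumption, as it is not stated in the text):

```latex
\frac{E(Y \mid x = 1, g) / n}{E(Y \mid x = 0, g) / n}
  = \frac{\exp(β_{0} + β_{1} + β_{2} g + β_{3} g^{2})}
         {\exp(β_{0} + β_{2} g + β_{3} g^{2})}
  = \exp(β_{1})
```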

We used the fitted regression models to estimate α in each species. To this end, we first obtained point estimates of the substitution rates on the X (Z) and autosomes calculated at the mean GC content values of the X (Z) windows. We then converted the resulting X (Z)-to-autosome substitution rate ratio to an estimate of α using Miyata’s equations (Miyata et al., 1987). This approach implicitly assumes that mutation rates in X (Z) and autosomes differ only with regard to their exposure to sex, once differences in pertinent genomic features are taken into account.
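The conversion from a substitution rate ratio to α can be sketched as follows. The expressions used here are the standard form of Miyata's relations under equal generation times in the two sexes, in which an X chromosome spends two-thirds of its time in females and a Z chromosome two-thirds of its time in males; the exact implementation used by the authors is not given in this section, and the function names are hypothetical.

```python
def xa_ratio_from_alpha(alpha):
    """Expected X-to-autosome rate ratio for a given alpha (XY systems)."""
    return 2.0 * (2.0 + alpha) / (3.0 * (1.0 + alpha))


def alpha_from_xa_ratio(ratio):
    """Invert the X-to-autosome relation; defined for 2/3 < ratio < 4/3."""
    return (4.0 - 3.0 * ratio) / (3.0 * ratio - 2.0)


def alpha_from_za_ratio(ratio):
    """Invert the Z-to-autosome relation; defined for 2/3 < ratio < 4/3."""
    return (3.0 * ratio - 2.0) / (4.0 - 3.0 * ratio)


# An X-to-autosome ratio of 0.8 corresponds to alpha of about 4,
# and a Z-to-autosome ratio of 10/9 to alpha of about 2.
print(round(alpha_from_xa_ratio(0.8), 6), round(alpha_from_za_ratio(10.0 / 9.0), 6))
```

As the X-to-autosome ratio approaches 2/3 (or the Z-to-autosome ratio approaches 4/3), the denominator approaches zero, so small changes in the ratio translate into large changes in α.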

We note that this approach infers α from the ratio of the expectations of the X (Z) and autosomal substitution rates rather than the expectation of the ratios. To check whether that makes a difference, we re-estimated α in each species using a modified procedure in which we repeatedly sampled a pair of X (Z) and autosome windows with GC content values in a narrow range (mean GC content value of the X (Z) chromosome $\pm 1.5\%$) and calculated an X (Z)-to-autosome substitution rate ratio. Estimating α from the mean ratio across 1000 resamples yielded estimates highly similar to those obtained from our regression approach ($r = 0.93$ across species, Figure 2—figure supplement 1).

To understand whether controlling for replication timing in addition to GC content might affect our α estimates, we modified Equation 3 to include an extra term for the average replication timing of each window, $t_{i}$ (see GC content and replication timing estimates). We applied this modified regression framework to mammals and obtained X-autosome substitution rate ratios for each species at the mean GC content and replication timing values of the X windows. Converting the X-autosome substitution rate ratios to α estimates using Miyata’s equations (Miyata et al., 1987) yielded values that were highly similar to those obtained when controlling for GC content only ($r > 0.99$, see Figure 2—figure supplement 1B). Given the observed agreement and the lack of replication timing data for most species, in subsequent analyses, we relied on evolutionary estimates obtained from the regression model described in Equation 3.

To assess the uncertainty in our α estimates, we bootstrap resampled windows on the X (Z) and autosomes 500 times. For each replicate, we fit the regression model and calculated the X (Z)-to-autosome ratio as described above to obtain an empirical distribution from which we could compute the central 95% interval. We note that because of the functional form describing the relationship between α and the X (Z)-to-autosome substitution rate ratio (Figure 1A), confidence intervals on α tend to be wider at larger values of α. In other words, in the regime of large α, a small shift in the X (Z)-to-autosome substitution rate ratio will have a larger impact on the inferred α estimate. We implemented our regression and α estimation framework in the R script, alpha_from_unrest.regression.R.
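The window bootstrap can be illustrated with a simplified estimator. In the sketch below, the ratio of pooled substitution rates (total substitutions divided by total sites) stands in for the regression fit, which is a deliberate simplification; the function name, the input layout, and the example counts are hypothetical.

```python
import random


def bootstrap_ratio_interval(x_windows, a_windows, n_boot=500, seed=0):
    """Central 95% bootstrap interval for the X (Z)-to-autosome rate ratio.

    x_windows, a_windows: lists of (substitutions, sites) per window
    """
    rng = random.Random(seed)

    def pooled_rate(windows):
        return sum(s for s, _ in windows) / sum(n for _, n in windows)

    ratios = []
    for _ in range(n_boot):
        # Resample windows with replacement, separately for each class.
        x_sample = rng.choices(x_windows, k=len(x_windows))
        a_sample = rng.choices(a_windows, k=len(a_windows))
        ratios.append(pooled_rate(x_sample) / pooled_rate(a_sample))
    ratios.sort()
    lower = ratios[int(0.025 * n_boot)]
    upper = ratios[int(0.975 * n_boot) - 1]
    return lower, upper


# Hypothetical windows, for illustration only.
x_windows = [(80, 10_000), (85, 10_000), (78, 10_000)]
a_windows = [(100, 10_000), (98, 10_000), (103, 10_000)]
print(bootstrap_ratio_interval(x_windows, a_windows))
```

Each bound of the resulting interval can then be converted to α with Miyata's equations, which is where the widening of intervals at large α arises.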

Although ignored in the original Miyata et al. approach and subsequent applications (e.g., Wilson Sayres et al., 2011; Wang et al., 2014; Schield et al., 2021), recent modeling work shows that sex differences in generation times can also affect the relative ratio of substitution rates on the X (Z) and autosome by altering the amount of time that a sex chromosome lineage spends in males versus females compared to autosomes (Amster and Sella, 2016). Thus, sex differences in generation times modulate how sex biases in mutations are reflected in substitution rates of X (Z) versus autosomes. Unfortunately, sex-specific generation time estimates are rarely available for extant species, let alone ancestral lineages, and likely evolve over time. To incorporate uncertainty in sex differences in generation times, we re-computed our uncertainty intervals on α under the assumption that the male-to-female ratio of the generation times for any particular lineage lies between 0.9 and 1.1, using formulas derived by Amster and Sella, 2016.

Estimating α from pedigree studies in vertebrates

In order to obtain estimates of α from extant vertebrate species, we identified 14 DNM studies with published counts of parentally phased DNMs (Bergeron et al., 2021; Besenbacher et al., 2019; Campbell et al., 2021; Harland et al., 2017; Jónsson et al., 2017; Lindsay et al., 2019; Smeds et al., 2016; Tatsumoto et al., 2017; Thomas et al., 2018; Wang et al., 2020; Wang et al., 2022a; Wang et al., 2022b; Wu et al., 2020; Yang et al., 2021). For each species in each study, we calculated point estimates of α by dividing the number of DNMs phased to the paternal chromosome by the number phased to the maternal chromosome (Supplementary file 2). We measured uncertainty by computing binomial confidence intervals on the proportion of all phased DNMs that were paternal and then converting the resulting interval bounds back to a paternal-to-maternal ratio.
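The point estimate and interval described above can be sketched as follows. The text does not state which binomial interval was used; the Wilson score interval below is an assumption of this sketch, chosen because it needs only the standard library. The function name and the example counts are hypothetical, and the function assumes at least one maternally phased DNM.

```python
import math


def alpha_from_dnm_counts(paternal, maternal, z=1.96):
    """Return (alpha, lower, upper) from phased DNM counts.

    The interval is computed on the paternal proportion (Wilson score,
    an assumed choice) and converted back to a paternal-to-maternal ratio.
    """
    n = paternal + maternal
    p = paternal / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n))

    def to_ratio(q):
        return q / (1.0 - q)

    return paternal / maternal, to_ratio(center - half), to_ratio(center + half)


# Hypothetical counts: 80 paternal and 20 maternal phased DNMs.
print(alpha_from_dnm_counts(80, 20))
```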

From this list, we excluded one study from mouse lemur (Microcebus murinus), which reported an anomalously high mutation rate per year for a primate species ($> 3.5 \times 10^{-9}$ per site) and unusually low rates of transitions at CpG sites (Campbell et al., 2021). The authors suggested C-to-T substitutions in the branch leading to mouse lemur occurred at a similar rate irrespective of their dinucleotide context (CpG or non-CpG), in contrast to what is seen in other primates (Moorjani et al., 2016). However, analyzing our substitution data, we find the C>T substitution rate in mouse lemur to be over fivefold higher at CpG sites compared to non-CpG sites. Specifically, we estimated substitution rates from our filtered autosomal mammalian alignments as described in Estimating putatively neutral substitution rates with the following modifications: (I) CpG islands, as defined here, were masked following Campbell et al., 2021; (II) CpG dinucleotide substitution rates were estimated using a context-dependent model (--subst-mod U2S). This study also reports the weakest mammalian paternal bias in mutation described to date (α = 1.18). This value is out of sync with reports for other primates and far from what we estimate from substitution rates, ${\hat{α}}_{evo}$ (Figure 2 and Supplementary file 2). One possibility is that a substantial rate of false positive DNMs biased ${\hat{α}}_{dnm}$ toward 1 (since errors are likely placed with equal probability on the maternal or paternal haplotype). Given the uncertainty surrounding how to interpret these DNM data, we do not include this ${\hat{α}}_{dnm}$ in our analyses.

Estimating α for different developmental stages

DNM studies typically quantify the number of mutations in the offspring that are not found in some somatic tissue (usually blood) of the parents. This approach can mistakenly include DNMs that occurred in the early development of the offspring, as well as mistakenly exclude DNMs that occurred early in the development of the parents (Gao et al., 2016). DNMs that occurred in early development of the parents can be distinguished by patterns of ‘incomplete linkage’ with nearby informative constitutive heterozygous positions, as well as incomplete transmission to the offspring (Harland et al., 2017; Sasani et al., 2019). Moreover, DNMs that occurred right after or during primordial germ cell specification (PGCS) will not be present in the soma of the parents but may be transmitted to multiple offspring (Sasani et al., 2019; Lindsay et al., 2019).

To examine if α varies across developmental stages, we considered studies that distinguish between DNMs in the early development of the parent (i.e., mutations detectable in the parental soma but showing patterns of ‘incomplete linkage’, as well as DNMs transmitted to multiple offspring), versus DNMs that occurred in later stages after PGCS (i.e., not present in the parental soma and transmitted to a single offspring). Counts for early DNMs were obtained as follows: in mice (Lindsay et al., 2019), where we counted the number of mutations phased to each parental haplotype in ‘Early Embryonic’ and ‘Peri-PGC’ categories (Supplementary Data 1 at https://doi.org/10.1038/s41467-019-12023-w); in cattle (Harland et al., 2017), where we counted mutations classified as ‘Sire Mosaic’ or ‘Dam Mosaic’ (Supplementary Table 1 at https://doi.org/10.1101/079863); and in humans (Sasani et al., 2019), where we counted the number of mutations phased to each parental haplotype in ‘Gonosomal mutations’ and ‘Post-PGCS’ (Tables in https://github.com/quinlan-lab/ceph-dnm-manuscript/tree/master/data). DNM counts for phases later in development were obtained from the same publications, under the categories ‘Late post-PGCS’, ‘Sire/Dam non Mosaic’, and ‘Third-generation’ in mice, cattle, and humans, respectively. All three studies also employed strategies to discard DNMs in the early development of the offspring. The combined counts for each species and mutation timings can be found in Supplementary file 3.

Since the paternal bias in mutation varies among developmental stages, as does the fraction of mutations that were successfully phased (Supplementary file 3), simply summing over DNM counts from different stages would result in a biased point estimate of the overall α. We therefore computed α by extrapolating the proportion of paternally and maternally phased DNMs in each stage to all the DNMs identified in that stage (i.e., extrapolating to what would be expected given complete phasing). Given this extrapolation, the measures of uncertainty associated with ‘Total’ are not shown in Figure 4A. For DNMs within a single developmental stage, we calculated binomial confidence intervals, as described above.
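
The extrapolation can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used for Figure 4A: the function names, the dictionary keys, and all counts are hypothetical and do not come from Supplementary file 3.

```python
def extrapolated_alpha(stages):
    """Overall alpha after scaling each stage's phased proportions
    up to all DNMs identified in that stage (phased or not)."""
    paternal_all = 0.0
    maternal_all = 0.0
    for stage in stages:
        phased = stage["paternal"] + stage["maternal"]
        frac_paternal = stage["paternal"] / phased
        paternal_all += frac_paternal * stage["total"]
        maternal_all += (1.0 - frac_paternal) * stage["total"]
    return paternal_all / maternal_all


def naive_alpha(stages):
    """Overall alpha from simply summing phased counts across stages."""
    paternal = sum(stage["paternal"] for stage in stages)
    maternal = sum(stage["maternal"] for stage in stages)
    return paternal / maternal


# Hypothetical counts: the early stage is fully phased, the late stage only 25%.
stages = [
    {"name": "early", "paternal": 10, "maternal": 10, "total": 20},
    {"name": "late", "paternal": 40, "maternal": 10, "total": 200},
]
print(naive_alpha(stages), extrapolated_alpha(stages))
```

In this toy case the naive sum gives too much weight to the well-phased early stage, so it understates the overall ratio relative to the extrapolated estimate.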

Testing relationships between α and life history traits

In mammals, we collected life history traits from the AnAge database (https://genomics.senescence.info/species/dataset.zip), including maximum longevity, gestation time, adult weight, and birth weight. We also obtained generation time estimates from the literature (Supplementary file 2). Thus, in total, we collected data on five traits. Four species were not represented in the AnAge dataset; in these cases, we substituted the trait values of closely related species of the same genus (see Supplementary file 2 for species substitutions). We additionally performed principal component analysis (PCA) on four of the traits (generation time, gestation time, adult weight, and birth weight; Figure 3—figure supplement 2), and treated PC1 and PC2 as meta-traits to be tested alongside the others. Only the 17 mammalian species annotated for all four traits were included in the PCA procedure. The first principal component captured 90% of the variance in the traits and was highly correlated with generation time ($r^{2} = 86\%$). In birds, we focused on the life history trait of generation time, taking estimates from the literature (Supplementary file 2).

To test for relationships between life history traits and α while accounting for phylogenetic non-independence in our data, we used phylogenetic generalized least squares (PGLS) (Grafen, 1989). Ordinary least squares is unsuitable for species trait comparisons, because shared phylogenetic history can create correlation structure in the residuals (Felsenstein, 1985). PGLS addresses this issue by considering the covariance structure of the residuals as a covariate, assuming that the traits evolve under Brownian motion on the phylogeny (Grafen, 1989; Pagel, 1999). We implemented the analysis using the pgls function in the caper R package, which provides the option of fitting Pagel’s λ (Pagel, 1999), a scalar multiplier of the off-diagonal elements of the expected covariance matrix of the residuals. Briefly, λ denotes the amount of phylogenetic ‘signal’ in the data. If λ is 0, there is no phylogenetic signal; when λ is 1, the regression model is equivalent to the method of phylogenetic independent contrasts (PIC) (Blomberg et al., 2012; Felsenstein, 1985; Pagel, 1999). In practice, we found that the pgls R function would occasionally fail to converge or converge on a local maximum during maximum likelihood estimation of λ; to address this issue, we initialized the likelihood optimization algorithm with a variety of starting values for λ and retained the model with the highest overall likelihood, which required a minor modification of the base pgls function from the caper package.
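
The two ingredients described above, the λ scaling of the covariance matrix and the multi-start search, can be sketched in a few lines. This is an illustrative sketch, not the modified caper function: the simple hill climb stands in for caper's optimizer, `toy_loglik` is a made-up bimodal profile rather than a real PGLS likelihood, and all names and values are hypothetical.

```python
def pagel_lambda_transform(cov, lam):
    """Scale the off-diagonal elements of a covariance matrix by lam,
    leaving the diagonal (variances) unchanged."""
    n = len(cov)
    return [[cov[i][j] if i == j else lam * cov[i][j] for j in range(n)]
            for i in range(n)]


def multistart_maximize(loglik, starts, lower=0.0, upper=1.0, step=0.1, tol=1e-6):
    """Run a bounded hill climb from each starting value and keep the best optimum."""
    best_x, best_val = None, float("-inf")
    for x in starts:
        val, h = loglik(x), step
        while h > tol:
            moved = False
            for cand in (max(lower, x - h), min(upper, x + h)):
                cand_val = loglik(cand)
                if cand_val > val:
                    x, val, moved = cand, cand_val, True
            if not moved:
                h /= 2.0  # shrink the step once no neighbour improves
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


def toy_loglik(lam):
    # Hypothetical profile with a local peak near 0.15 and a global peak near 0.85.
    return max(-20.0 * (lam - 0.15) ** 2 - 1.0, -20.0 * (lam - 0.85) ** 2)


cov = [[1.0, 0.6, 0.0], [0.6, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(pagel_lambda_transform(cov, 0.5))
print(multistart_maximize(toy_loglik, starts=[0.1]))
print(multistart_maximize(toy_loglik, starts=[0.1, 0.5, 0.9]))
```

A single start near the local peak stays there, whereas the run with several starts recovers the higher peak, which is the failure mode the multiple initializations are meant to avoid.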

For each predictive trait (Figure 3—figure supplement 1), we used our ${\hat{α}}_{evo}$ estimates from X (Z)-to-autosome comparisons as the response variable and a time-calibrated phylogeny from TimeTree to estimate the covariance matrix (http://timetree.org/). Following what had been done previously to analyze these relationships (Wilson Sayres et al., 2011), we log10-transformed each life history trait prior to performing PGLS. Canis lupus familiaris, Ceratotherium simum cottoni, and Pterocles burchelli were not named in the TimeTree database and so we used split times for Canis lupus, Ceratotherium simum, and Pterocles gutturalis instead, respectively (Supplementary file 2). In all comparisons, we calculated p-values under a model in which λ was set to its maximum likelihood estimate and used default values for the remaining arguments of the pgls program. In birds, in which the MLE for λ was 0, we also considered a model in which λ was fixed at 1. To test whether the slope of the ${\hat{α}}_{evo}$ versus generation time relationship is the same in birds as in mammals, we performed a modified PGLS regression on the bird data with the slope fixed to the maximum likelihood value obtained for mammals (i.e., slope = 1.20) and the intercept and λ as the free parameters. After fitting this model with PGLS, we performed a likelihood ratio test (df = 1) to compare it to an alternative model in which the slope was not fixed (i.e., including intercept, slope, and λ parameters).
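
The nested-model comparison can be sketched as a standard likelihood ratio test with one degree of freedom, since the fixed-slope model has one fewer free parameter. The function name and the log-likelihood values below are hypothetical placeholders, not results from the analysis.

```python
import math


def lrt_pvalue_df1(loglik_null, loglik_alt):
    """Likelihood ratio statistic and p-value for one degree of freedom.

    For df = 1, the chi-square survival function equals erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (loglik_alt - loglik_null)
    stat = max(stat, 0.0)  # guard against small negative values from optimizer noise
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log-likelihoods: slope fixed (null) versus slope free (alternative).
stat, p_value = lrt_pvalue_df1(loglik_null=-12.4, loglik_alt=-10.1)
print(stat, p_value)
```

A small p-value would indicate that freeing the slope improves the fit more than expected by chance, i.e., that the bird slope differs from the value fixed from mammals.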

Modeling the effects of germline developmental stages on α

To model variation in α among species, we considered the expected number of mutations that arise in two developmental stages: an early embryonic period, Early, which loosely encompasses the time between the zygote and the sexual differentiation of the germline, and a second period, Late, that refers to the remaining time until reproduction (Figure 4B). In mammals, the expected number of mutations in the Early stage, $M_{e}$, is approximately the same in both sexes, as observed in the three cases in which there are data (Figure 4A). In the Late stage, we assume mutations arise at a constant rate per year, $μ_{s}$, in sex $s$ ($s \in \{f, m\}$). If we assume the length of Early to be negligible relative to the generation time, $G_{s}$ in sex $s$, then the expected number of mutations arising in the Late stage in sex $s$ equals $μ_{s} G_{s}$. Therefore, the expectation of the ratio of paternal-to-maternal mutations at reproduction, α, can be obtained using Equation 1.

To predict α in species lacking estimates of the sex-specific mutation rates for the Late stage (i.e., $μ_{m}$ and $μ_{f}$), we made two further assumptions, namely that:

1. The expected number of mutations per base pair $M_{e}$ in the Early stage is constant across species and the same in the two sexes. We used an $M_{e}$ of $1.66 \times 10^{-9}$ per base pair, which equates to five early embryonic mutations in a haploid genome of 3 Gb. This value was chosen based on observations in humans, notably a study showing that monozygotic twins differ on average by 5.2 mutations that arose between the twinning event and PGCS (1.3 mutations per haploid set of chromosomes) (Jonsson et al., 2021). Given that 75–80% of twinning events occur around the 8–16 cell stage (Hall, 2003), approximately four mutations are expected to have arisen during the first few divisions in the embryo (assuming ∼1 extra mutation per cell division; Ju et al., 2017). This rate is also in rough agreement with a pedigree study in humans, which estimated that ∼5% of DNMs arise during early development (Sasani et al., 2019). Varying the expected number from 3 to 7 yielded similar results (see below for more details).
2. The ratio $μ_{m} / μ_{f}$ is fixed across species. We assumed a ratio of 4, consistent with the ratio of paternal-to-maternal DNMs occurring post-PGCS in humans (Sasani et al., 2019), mice (Lindsay et al., 2019), and cattle (Harland et al., 2017) (Figure 4A).
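
As a quick check of the arithmetic behind the per-base-pair values, and assuming the same 3 Gb haploid genome throughout, three, five, and seven early mutations correspond to:

$$
\begin{aligned}
M_{e}(3) &= \frac{3}{3 \times 10^{9}} = 1.00 \times 10^{-9}, \\
M_{e}(5) &= \frac{5}{3 \times 10^{9}} \approx 1.67 \times 10^{-9}, \\
M_{e}(7) &= \frac{7}{3 \times 10^{9}} \approx 2.33 \times 10^{-9}.
\end{aligned}
$$

The middle value is the one quoted above as $1.66 \times 10^{-9}$ (apparently truncated rather than rounded), and the outer two match the alternative $M_{e}$ values used in the sensitivity check reported further down.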

Using derivations from Amster and Sella, 2016, the yearly substitution rate $μ_{Y}$ for a given lineage is:

$$μ_{Y} = \frac{2 M_{e} + μ_{f} G_{f} + μ_{m} G_{m}}{G_{f} + G_{m}} \qquad (2)$$

If $μ_{m} / μ_{f} = 4$ and $M_{e}$ is known, we can solve for $μ_{f}$ using Equation 2 and α can be estimated using Equation 1.
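
The calculation can be sketched as follows. Equation 1 is not reproduced in this section, so the sketch assumes it has the form α = ($M_{e}$ + $μ_{m} G_{m}$) / ($M_{e}$ + $μ_{f} G_{f}$), which follows from the stage definitions above. Substituting $μ_{m} = 4 μ_{f}$ into Equation 2 gives $μ_{f} = (μ_{Y} (G_{f} + G_{m}) - 2 M_{e}) / (G_{f} + 4 G_{m})$. The function name, argument names, and input values are hypothetical; the notebook in the repository remains the reference implementation.

```python
def predict_alpha(mu_year, gen_f, gen_m, m_early=1.66e-9, late_ratio=4.0):
    """Predict alpha from a yearly rate and sex-specific generation times.

    Assumes alpha = (M_e + mu_m * G_m) / (M_e + mu_f * G_f) and
    mu_m = late_ratio * mu_f, and solves Equation 2 for mu_f.
    """
    late_total = mu_year * (gen_f + gen_m) - 2.0 * m_early
    if late_total <= 0:
        raise ValueError("yearly rate is too low to accommodate 2 * M_e early mutations")
    mu_f = late_total / (gen_f + late_ratio * gen_m)
    mu_m = late_ratio * mu_f
    alpha = (m_early + mu_m * gen_m) / (m_early + mu_f * gen_f)
    return alpha, mu_f, mu_m


# Illustrative inputs only: a yearly rate of 0.4e-9 per bp with 30- or 10-year generations.
for generation_time in (10.0, 30.0):
    alpha, mu_f, mu_m = predict_alpha(0.4e-9, generation_time, generation_time)
    print(generation_time, round(alpha, 2))
```

Under these assumptions the predicted α lies between 1 (Early mutations dominate) and 4 (Late mutations dominate), and it increases with generation time when the yearly rate is held fixed.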

We used the PGLS method described in ‘Testing relationships between α and life history traits’ to assess the fit of α values predicted by our model to the α values estimated from X-to-autosome comparisons (${\hat{α}}_{evo}$) and from DNM studies (${\hat{α}}_{dnm}$) (Figure 4C). We applied the model to mammals using estimates of $G$ from the literature (Supplementary file 2). When testing the fit of the model to ${\hat{α}}_{evo}$, we estimated $μ_{Y}$ by dividing the autosomal substitution rates in a lineage (see ‘Estimating putatively neutral substitution rates’) by the split time for that lineage reported in the TimeTree database (http://timetree.org/). When testing the fit to ${\hat{α}}_{dnm}$, we obtained α from yearly mutation rates reported in pedigree sequencing studies, given the parental ages in the study (see Supplementary file 2). We note that ${\hat{α}}_{dnm}$ can be noisy if not based on a large number of DNMs and trios. To overcome this limitation, we focused on species with at least 30 phased DNMs and more than one trio sequenced (which excluded three species out of 14, namely Pongo abelii, Callithrix jacchus, and Ursus arctos; see Supplementary file 2).

We note that the model remains a significant predictor for a range of $M_{e}$ values. For example, using a λ of 1 (as inferred by maximum likelihood in Figure 4D) and $M_{e} = 1 \times 10^{-9}$, the model yields $r^{2} = 0.33$ (p-value $= 0.008$) for ${\hat{α}}_{evo}$ and $r^{2} = 0.90$ (p-value $= 1 \times 10^{-4}$) for ${\hat{α}}_{dnm}$. With $M_{e} = 2.33 \times 10^{-9}$ instead, the model yields $r^{2} = 0.35$ (p-value $= 0.006$) for ${\hat{α}}_{evo}$ and $r^{2} = 0.79$ (p-value $= 0.001$) for ${\hat{α}}_{dnm}$.

Following Kong et al., 2012, we sought to determine the extent to which variation in ${\hat{α}}_{evo}$ in mammals is attributable to sampling error. To that end, we made use of the empirical distribution of ${\hat{α}}_{evo}$, which we obtained by bootstrap resampling genomic windows (see ‘Estimating α from X-to-autosome substitution rate ratios’). For each bootstrap replicate, we regressed the α estimates against our original ${\hat{α}}_{evo}$ using ordinary least squares and obtained the $r^{2}$ value. Across the 500 bootstrap replicates, the median $r^{2}$ value was 89%, suggesting that 11% of the variance in ${\hat{α}}_{evo}$ is due to sampling error. Combining this value with the estimated proportion of variance in ${\hat{α}}_{evo}$ explained by our model yielded an estimate of $37\% / 89\% = 42\%$ of the variance explained after accounting for sampling error.
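
The adjustment for sampling error can be sketched as below. This is a minimal illustration that assumes each bootstrap replicate supplies one α estimate per species in the same order as the original estimates; the function names and the toy values are hypothetical.

```python
from statistics import median


def r_squared(x, y):
    """Squared Pearson correlation, i.e., the r^2 of a simple OLS regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def variance_explained_after_noise(model_r2, original, replicates):
    """Divide the model r^2 by the median replicate-versus-original r^2."""
    reliability = median(r_squared(rep, original) for rep in replicates)
    return model_r2 / reliability, reliability


# Toy alpha estimates for four species and three bootstrap replicates.
original = [1.0, 2.0, 3.0, 4.0]
replicates = [[1.1, 1.9, 3.2, 3.8], [0.9, 2.2, 2.7, 4.1], [1.0, 2.1, 3.1, 4.3]]
print(variance_explained_after_noise(0.37, original, replicates))
```

With the values reported above (a model $r^{2}$ of 37% and a median bootstrap $r^{2}$ of 89%), the same division gives approximately 42%.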

The code to reproduce the modeling described above can be found in the scripts/2exposure_model.ipynb Jupyter notebook.

Data availability

All source data and scripts to reproduce the findings in the manuscript can be found at https://github.com/flw88/mut_sex_bias_amniotes (copy archived at swh:1:rev:37da9bdbc2c7cb839de15aadb554cf6c98128add).

The following previously published data sets were used

1. Armstrong J
2. Hickey G
3. Diekhans M
4. Fiddes IT
5. Novak AM
6. Deran A
7. Fang Q
8. Xie D
9. Feng S
10. Stiller J
11. Genereux D
12. Johnson J
13. Marinescu VD
14. Alföldi J
15. Harris RS
16. Lindblad-Toh K
17. Haussler D
18. Karlsson E
19. Jarvis ED
20. Zhang G
21. Paten B
(2020) UC Santa Cruz Computational Genomics Lab & Platform
ID /data/cactus/. Mammal and bird whole-genome alignment.

https://cglgenomics.ucsc.edu/data/cactus/
1. Sasani T
2. Quinlan A
(2019) github
ID quinlan-lab/ceph-dnm-manuscript. Human de novo mutations.

https://github.com/quinlan-lab/ceph-dnm-manuscript

References

1. Abascal F
2. Harvey LMR
3. Mitchell E
4. Lawson ARJ
5. Lensing SV
6. Ellis P
7. Russell AJC
8. Alcantara RE
9. Baez-Ortega A
10. Wang Y
11. Kwa EJ
12. Lee-Six H
13. Cagan A
14. Coorens THH
15. Chapman MS
16. Olafsson S
17. Leonard S
18. Jones D
19. Machado HE
20. Davies M
21. Øbro NF
22. Mahbubani KT
23. Allinson K
24. Gerstung M
25. Saeb-Parsy K
26. Kent DG
27. Laurenti E
28. Stratton MR
29. Rahbari R
30. Campbell PJ
31. Osborne RJ
32. Martincorena I
(2021) Somatic mutation landscapes at single-molecule resolution
Nature 593:405–410.

https://doi.org/10.1038/s41586-021-03477-4
1. Agarwal I
2. Przeworski M
(2019) Signatures of replication timing, recombination, and sex in the spectrum of rare variants on the human X chromosome and autosomes
PNAS 116:17916–17924.

https://doi.org/10.1073/pnas.1900714116
(2015) Clock-like mutational processes in human somatic cells
Nature Genetics 47:1402–1407.

https://doi.org/10.1038/ng.3441
(2020) The repertoire of mutational signatures in human cancer
Nature 578:94–101.

https://doi.org/10.1038/s41586-020-1943-3
1. Amster G
2. Sella G
(2016) Life history effects on the molecular clock of autosomes and sex chromosomes
PNAS 113:1588–1593.

https://doi.org/10.1073/pnas.1515798113
1. Amster G
2. Sella G
(2020) Life history effects on neutral diversity levels of autosomes and sex chromosomes
Genetics 215:1133–1142.

https://doi.org/10.1534/genetics.120.303119
1. Bergeron LA
2. Besenbacher S
3. Bakker J
4. Zheng J
5. Li P
6. Pacheco G
7. Sinding M-HS
8. Kamilari M
9. Gilbert MTP
10. Schierup MH
11. Zhang G
(2021) The germline mutational process in rhesus macaque and its implications for phylogenetic dating
GigaScience 10:giab029.

https://doi.org/10.1093/gigascience/giab029
(2019) Direct estimation of mutations in great apes reconciles phylogenetic dating
Nature Ecology & Evolution 3:286–292.

https://doi.org/10.1038/s41559-018-0778-x
1. Blokzijl F
2. de Ligt J
3. Jager M
4. Sasselli V
5. Roerink S
6. Sasaki N
7. Huch M
8. Boymans S
9. Kuijk E
10. Prins P
11. Nijman IJ
12. Martincorena I
13. Mokry M
14. Wiegerinck CL
15. Middendorp S
16. Sato T
17. Schwank G
18. Nieuwenhuis EES
19. Verstegen MMA
20. van der Laan LJW
21. de Jonge J
22. IJzermans JNM
23. Vries RG
24. van de Wetering M
25. Stratton MR
26. Clevers H
27. Cuppen E
28. van Boxtel R
(2016) Tissue-specific mutation accumulation in human adult stem cells during life
Nature 538:260–264.

https://doi.org/10.1038/nature19768
(2012) Independent contrasts and PGLS regression estimators are equivalent
Systematic Biology 61:382–391.

https://doi.org/10.1093/sysbio/syr118
(2021) Life-history traits and habitat availability shape genomic diversity in birds: implications for conservation
Proceedings. Biological Sciences 288:20211441.

https://doi.org/10.1098/rspb.2021.1441
1. Buffalo V
(2021a) Quantifying the relationship between genetic diversity and population size suggests natural selection cannot explain Lewontin’s paradox
eLife 10:e67509.

https://doi.org/10.7554/eLife.67509
Software
1. Buffalo V
(2021b) Paradox_variation, version b8f91d5
GitHub.

https://github.com/vsbuffalo/paradox_variation/
1. Cagan A
2. Baez-Ortega A
3. Brzozowska N
4. Abascal F
5. Coorens THH
6. Sanders MA
7. Lawson ARJ
8. Harvey LMR
9. Bhosle S
10. Jones D
11. Alcantara RE
12. Butler TM
13. Hooks Y
14. Roberts K
15. Anderson E
16. Lunn S
17. Flach E
18. Spiro S
19. Januszczak I
20. Wrigglesworth E
21. Jenkins H
22. Dallas T
23. Masters N
24. Perkins MW
25. Deaville R
26. Druce M
27. Bogeska R
28. Milsom MD
29. Neumann B
30. Gorman F
31. Constantino-Casas F
32. Peachey L
33. Bochynska D
34. Smith EStJ
35. Gerstung M
36. Campbell PJ
37. Murchison EP
38. Stratton MR
39. Martincorena I
(2022) Somatic mutation rates scale with lifespan across mammals
Nature 604:517–524.

https://doi.org/10.1038/s41586-022-04618-z
1. Campbell CR
2. Tiley GP
3. Poelstra JW
4. Hunnicutt KE
5. Larsen PA
6. Lee H-J
7. Thorne JL
8. Dos Reis M
9. Yoder AD
(2021) Pedigree-based and phylogenetic methods support surprising patterns of mutation rate and spectrum in the gray mouse lemur
Heredity 127:233–244.

https://doi.org/10.1038/s41437-021-00446-5
(2000) Male-biased mutation rates revealed from Z and W chromosome-linked ATP synthase alpha-subunit (ATP5A1) sequences in birds
Journal of Molecular Evolution 50:443–447.

https://doi.org/10.1007/s002390010046
(1994) Weak male-driven molecular evolution in rodents
PNAS 91:827–831.

https://doi.org/10.1073/pnas.91.2.827
(2008) X chromosome activity in mouse XX primordial germ cells
PLOS Genetics 4:e30.

https://doi.org/10.1371/journal.pgen.0040030
1. Crow JF
(2000) The origins, patterns and implications of human spontaneous mutation
Nature Reviews. Genetics 1:40–47.

https://doi.org/10.1038/35049558
(2009) Characterization of the bovine pseudoautosomal region and comparison with sheep, goat, and other mammalian pseudoautosomal regions
Cytogenetic and Genome Research 126:139–147.

https://doi.org/10.1159/000245913
Software
1. de Manuel M
(2022) Mut_sex_bias_amniotes, version swh:1:rev:37da9bdbc2c7cb839de15aadb554cf6c98128add
Software Heritage.

https://archive.softwareheritage.org/swh:1:dir:f7fe3a7dd980fe1199df3de3708f778745ca116d;origin=https://github.com/flw88/mut_sex_bias_amniotes;visit=swh:1:snp:2597b175cff38ce9069514d5ee1b73428c514288;anchor=swh:1:rev:37da9bdbc2c7cb839de15aadb554cf6c98128add
(2011)
In Hormones and Reproduction of Vertebrates

27–70, Avian testicular structure, function, and regulation, In Hormones and Reproduction of Vertebrates, Elsevier.
1. Drost JB
2. Lee WR
(1995) Biological basis of germline mutation: comparisons of spontaneous germline mutation rates among Drosophila, mouse, and human
Environmental and Molecular Mutagenesis 25 Suppl 26:48–64.

https://doi.org/10.1002/em.2850250609
1. Duret L
2. Galtier N
(2009) Biased gene conversion and the evolution of mammalian genomic landscapes
Annual Review of Genomics and Human Genetics 10:285–311.

https://doi.org/10.1146/annurev-genom-082908-150001
1. Ellegren H
2. Fridolfsson AK
(1997) Male-driven evolution of DNA sequences in birds
Nature Genetics 17:182–184.

https://doi.org/10.1038/ng1097-182
1. Ellegren H
(2010) Evolutionary stasis: the stable chromosomes of birds
Trends in Ecology & Evolution 25:283–291.

https://doi.org/10.1016/j.tree.2009.12.004
1. Engel N
(2018) Sex differences in early embryogenesis: inter-chromosomal regulation sets the stage for sex-biased gene networks: the dialogue between the sex chromosomes and autosomes imposes sexual identity soon after fertilization
BioEssays 40:e1800073.

https://doi.org/10.1002/bies.201800073
1. Extavour CG
2. Akam M
(2003) Mechanisms of germ cell specification across the metazoans: epigenesis and preformation
Development 130:5869–5884.

https://doi.org/10.1242/dev.00804
1. Felsenstein J
(1985) Phylogenies and the comparative method
The American Naturalist 125:1–15.

https://doi.org/10.1086/284325
1. Feng S
2. Stiller J
3. Deng Y
4. Armstrong J
5. Fang Q
6. Reeve AH
7. Xie D
8. Chen G
9. Guo C
10. Faircloth BC
11. Petersen B
12. Wang Z
13. Zhou Q
14. Diekhans M
15. Chen W
16. Andreu-Sánchez S
17. Margaryan A
18. Howard JT
19. Parent C
20. Pacheco G
21. Sinding M-HS
22. Puetz L
23. Cavill E
24. Ribeiro ÂM
25. Eckhart L
26. Fjeldså J
27. Hosner PA
28. Brumfield RT
29. Christidis L
30. Bertelsen MF
31. Sicheritz-Ponten T
32. Tietze DT
33. Robertson BC
34. Song G
35. Borgia G
36. Claramunt S
37. Lovette IJ
38. Cowen SJ
39. Njoroge P
40. Dumbacher JP
41. Ryder OA
42. Fuchs J
43. Bunce M
44. Burt DW
45. Cracraft J
46. Meng G
47. Hackett SJ
48. Ryan PG
49. Jønsson KA
50. Jamieson IG
51. da Fonseca RR
52. Braun EL
53. Houde P
54. Mirarab S
55. Suh A
56. Hansson B
57. Ponnikas S
58. Sigeman H
59. Stervander M
60. Frandsen PB
61. van der Zwan H
62. van der Sluis R
63. Visser C
64. Balakrishnan CN
65. Clark AG
66. Fitzpatrick JW
67. Bowman R
68. Chen N
69. Cloutier A
70. Sackton TB
71. Edwards SV
72. Foote DJ
73. Shakya SB
74. Sheldon FH
75. Vignal A
76. Soares AER
77. Shapiro B
78. González-Solís J
79. Ferrer-Obiol J
80. Rozas J
81. Riutort M
82. Tigano A
83. Friesen V
84. Dalén L
85. Urrutia AO
86. Székely T
87. Liu Y
88. Campana MG
89. Corvelo A
90. Fleischer RC
91. Rutherford KM
92. Gemmell NJ
93. Dussex N
94. Mouritsen H
95. Thiele N
96. Delmore K
97. Liedvogel M
98. Franke A
99. Hoeppner MP
100. Krone O
101. Fudickar AM
102. Milá B
103. Ketterson ED
104. Fidler AE
105. Friis G
106. Parody-Merino ÁM
107. Battley PF
108. Cox MP
109. Lima NCB
110. Prosdocimi F
111. Parchman TL
112. Schlinger BA
113. Loiselle BA
114. Blake JG
115. Lim HC
116. Day LB
117. Fuxjager MJ
118. Baldwin MW
119. Braun MJ
120. Wirthlin M
121. Dikow RB
122. Ryder TB
123. Camenisch G
124. Keller LF
125. DaCosta JM
126. Hauber ME
127. Louder MIM
128. Witt CC
129. McGuire JA
130. Mudge J
131. Megna LC
132. Carling MD
133. Wang B
134. Taylor SA
135. Del-Rio G
136. Aleixo A
137. Vasconcelos ATR
138. Mello CV
139. Weir JT
140. Haussler D
141. Li Q
142. Yang H
143. Wang J
144. Lei F
145. Rahbek C
146. Gilbert MTP
147. Graves GR
148. Jarvis ED
149. Paten B
150. Zhang G
(2020) Dense sampling of bird diversity increases power of comparative genomics
Nature 587:252–257.

https://doi.org/10.1038/s41586-020-2873-9
(2015) Genome-wide patterns and properties of de novo mutations in humans
Nature Genetics 47:822–826.

https://doi.org/10.1038/ng.3292
1. Gao Z
2. Wyman MJ
3. Sella G
4. Przeworski M
(2016) Interpreting the dependence of mutation rates on age and time
PLOS Biology 14:e1002355.

https://doi.org/10.1371/journal.pbio.1002355
1. Gao Z
2. Moorjani P
3. Sasani TA
4. Pedersen BS
5. Quinlan AR
6. Jorde LB
7. Amster G
8. Przeworski M
(2019) Overlooked roles of DNA damage and maternal age in generating human germline mutations
PNAS 116:9491–9500.

https://doi.org/10.1073/pnas.1901259116
1. Goldmann JM
2. Wong WSW
3. Pinelli M
4. Farrah T
5. Bodian D
6. Stittrich AB
7. Glusman G
8. Vissers L
9. Hoischen A
10. Roach JC
11. Vockley JG
12. Veltman JA
13. Solomon BD
14. Gilissen C
15. Niederhuber JE
(2016) Parent-of-origin-specific signatures of de novo mutations
Nature Genetics 48:935–939.

https://doi.org/10.1038/ng.3597
(2019) De novo mutations reflect development and aging of the human germline
Trends in Genetics 35:828–839.

https://doi.org/10.1016/j.tig.2019.08.005
1. Grafen A
(1989) The phylogenetic regression
Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 326:119–157.

https://doi.org/10.1098/rstb.1989.0106
1. Guo F
2. Yan L
3. Guo H
4. Li L
5. Hu B
6. Zhao Y
7. Yong J
8. Hu Y
9. Wang X
10. Wei Y
11. Wang W
12. Li R
13. Yan J
14. Zhi X
15. Zhang Y
16. Jin H
17. Zhang W
18. Hou Y
19. Zhu P
20. Li J
21. Zhang L
22. Liu S
23. Ren Y
24. Zhu X
25. Wen L
26. Gao YQ
27. Tang F
28. Qiao J
(2015) The transcriptome and DNA methylome landscapes of human primordial germ cells
Cell 161:1437–1452.

https://doi.org/10.1016/j.cell.2015.05.015
1. Guo J
2. Nie X
3. Giebler M
4. Mlcochova H
5. Wang Y
6. Grow EJ
7. Kim R
8. Tharmalingam M
9. Matilionyte G
10. Lindskog C
11. Carrell DT
12. Mitchell RT
13. Goriely A
14. Hotaling JM
15. Cairns BR
16. DonorConnect
(2020) The dynamic transcriptional cell atlas of testis development during human puberty
Cell Stem Cell 26:262–276.

https://doi.org/10.1016/j.stem.2019.12.005
Book
1. Guraya SS
(1989) Ovarian Follicles in Reptiles and Birds
Berlin, Heidelberg: Springer-Verlag.

https://doi.org/10.1007/978-3-642-83628-2
1. Haldane JBS
(1946) The mutation rate of the gene for haemophilia, and its segregation ratios in males and females
Annals of Eugenics 13:262–271.

https://doi.org/10.1111/j.1469-1809.1946.tb02367.x
1. Hall JG
(2003) Twinning
Lancet 362:735–743.

https://doi.org/10.1016/S0140-6736(03)14237-7
(2021) Mammalian primordial germ cell specification
Development 148:dev189217.

https://doi.org/10.1242/dev.189217
1. Hardison RC
2. Roskin KM
3. Yang S
4. Diekhans M
5. Kent WJ
6. Weber R
7. Elnitski L
8. Li J
9. O’Connor M
10. Kolbe D
11. Schwartz S
12. Furey TS
13. Whelan S
14. Goldman N
15. Smit A
16. Miller W
17. Chiaromonte F
18. Haussler D
(2003) Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution
Genome Research 13:13–26.

https://doi.org/10.1101/gr.844103
Preprint
1. Harland C
2. Charlier C
3. Karim L
4. Cambisano N
5. Deckers M
6. Mni M
7. Mullaart E
8. Coppieters W
9. Georges M
(2017) Frequency of Mosaicism Points towards Mutation-Prone Early Cleavage Cell Divisions in Cattle
bioRxiv.

https://doi.org/10.1101/079863
1. Hellmann I
2. Prüfer K
3. Ji H
4. Zody MC
5. Pääbo S
6. Ptak SE
(2005) Why do human diversity levels vary at a megabase scale?
Genome Research 15:1222–1231.

https://doi.org/10.1101/gr.3461105
Software
1. Hickey G
(2022a) Cactus, version eba5c75
GitHub.

https://github.com/ComparativeGenomicsToolkit/cactus
Software
1. Hickey G
(2022b) Hal, version 378d28a
GitHub.

https://github.com/ComparativeGenomicsToolkit/hal/
1. Huang W
2. Chang BH
3. Gu X
4. Hewett-Emmett D
5. Li W
(1997) Sex differences in mutation rate in higher primates estimated from AMG intron sequences
Journal of Molecular Evolution 44:463–465.

https://doi.org/10.1007/PL00006166
(2011) PHAST and RPHAST: phylogenetic analysis with space/time models
Briefings in Bioinformatics 12:41–51.

https://doi.org/10.1093/bib/bbq072
1. Ioannidis J
2. Taylor G
3. Zhao D
4. Liu L
5. Idoko-Akoh A
6. Gong D
7. Lovell-Badge R
8. Guioli S
9. McGrew MJ
10. Clinton M
(2021) Primary sex determination in birds depends on DMRT1 dosage, but gonadal sex does not determine adult secondary sex characteristics
PNAS 118:10.

https://doi.org/10.1073/pnas.2020909118
1. Jarvis ED
2. Mirarab S
3. Aberer AJ
4. Li B
5. Houde P
6. Li C
7. Ho SYW
8. Faircloth BC
9. Nabholz B
10. Howard JT
11. Suh A
12. Weber CC
13. da Fonseca RR
14. Li J
15. Zhang F
16. Li H
17. Zhou L
18. Narula N
19. Liu L
20. Ganapathy G
21. Boussau B
22. Bayzid MS
23. Zavidovych V
24. Subramanian S
25. Gabaldón T
26. Capella-Gutiérrez S
27. Huerta-Cepas J
28. Rekepalli B
29. Munch K
30. Schierup M
31. Lindow B
32. Warren WC
33. Ray D
34. Green RE
35. Bruford MW
36. Zhan X
37. Dixon A
38. Li S
39. Li N
40. Huang Y
41. Derryberry EP
42. Bertelsen MF
43. Sheldon FH
44. Brumfield RT
45. Mello CV
46. Lovell PV
47. Wirthlin M
48. Schneider MPC
49. Prosdocimi F
50. Samaniego JA
51. Vargas Velazquez AM
52. Alfaro-Núñez A
53. Campos PF
54. Petersen B
55. Sicheritz-Ponten T
56. Pas A
57. Bailey T
58. Scofield P
59. Bunce M
60. Lambert DM
61. Zhou Q
62. Perelman P
63. Driskell AC
64. Shapiro B
65. Xiong Z
66. Zeng Y
67. Liu S
68. Li Z
69. Liu B
70. Wu K
71. Xiao J
72. Yinqi X
73. Zheng Q
74. Zhang Y
75. Yang H
76. Wang J
77. Smeds L
78. Rheindt FE
79. Braun M
80. Fjeldsa J
81. Orlando L
82. Barker FK
83. Jønsson KA
84. Johnson W
85. Koepfli K-P
86. O’Brien S
87. Haussler D
88. Ryder OA
89. Rahbek C
90. Willerslev E
91. Graves GR
92. Glenn TC
93. McCormack J
94. Burt D
95. Ellegren H
96. Alström P
97. Edwards SV
98. Stamatakis A
99. Mindell DP
100. Cracraft J
101. Braun EL
102. Warnow T
103. Jun W
104. Gilbert MTP
105. Zhang G
(2014) Whole-genome analyses resolve early branches in the tree of life of modern birds
Science 346:1320–1331.

https://doi.org/10.1126/science.1253451
(2017) Parental influence on human germline de novo mutations in 1,548 trios from Iceland
Nature 549:519–522.

https://doi.org/10.1038/nature24018
(2021) Differences between germline genomes of monozygotic twins
Nature Genetics 53:27–34.

https://doi.org/10.1038/s41588-020-00755-1
1. Ju YS
2. Martincorena I
3. Gerstung M
4. Petljak M
5. Alexandrov LB
6. Rahbari R
7. Wedge DC
8. Davies HR
9. Ramakrishna M
10. Fullam A
11. Martin S
12. Alder C
13. Patel N
14. Gamble S
15. O’Meara S
16. Giri DD
17. Sauer T
18. Pinder SE
19. Purdie CA
20. Borg Å
21. Stunnenberg H
22. van de Vijver M
23. Tan BKT
24. Caldas C
25. Tutt A
26. Ueno NT
27. van ’t Veer LJ
28. Martens JWM
29. Sotiriou C
30. Knappskog S
31. Span PN
32. Lakhani SR
33. Eyfjörd JE
34. Børresen-Dale A-L
35. Richardson A
36. Thompson AM
37. Viari A
38. Hurles ME
39. Nik-Zainal S
40. Campbell PJ
41. Stratton MR
(2017) Somatic mutations reveal asymmetric cellular dynamics in the early human embryo
Nature 543:714–718.

https://doi.org/10.1038/nature21703
Book
1. Kimura M
(1983) The Neutral Theory of Molecular Evolution
New York: Cambridge University Press.

https://doi.org/10.1017/CBO9780511623486
(2012) Rate of de novo mutations and the importance of father’s age to disease risk
Nature 488:471–475.

https://doi.org/10.1038/nature11396
1. Koren A
2. Polak P
3. Nemesh J
4. Michaelson JJ
5. Sebat J
6. Sunyaev SR
7. McCarroll SA
(2012) Differential relationship of DNA replication timing to different forms of human mutation and variation
American Journal of Human Genetics 91:1033–1040.

https://doi.org/10.1016/j.ajhg.2012.10.018
(1996) Rates of nucleotide substitution in primates and rodents and the generation-time effect hypothesis
Molecular Phylogenetics and Evolution 5:182–187.

https://doi.org/10.1006/mpev.1996.0012
1. Li WH
2. Yi S
3. Makova K
(2002) Male-driven evolution
Current Opinion in Genetics & Development 12:650–656.

https://doi.org/10.1016/s0959-437x(02)00354-4
1. Li G
2. Hillier LW
3. Grahn RA
4. Zimin AV
5. David VA
6. Menotti-Raymond M
7. Middleton R
8. Hannah S
9. Hendrickson S
10. Makunin A
11. O’Brien SJ
12. Minx P
13. Wilson RK
14. Lyons LA
15. Warren WC
16. Murphy WJ
(2016) A high-resolution SNP array-based linkage map anchors a new domestic cat draft genome assembly and provides detailed patterns of recombination
G3: Genes, Genomes, Genetics 6:1607–1616.

https://doi.org/10.1534/g3.116.028746
1. Lin Y-T
2. Capel B
(2015) Cell fate commitment during mammalian sex determination
Current Opinion in Genetics & Development 32:144–152.

https://doi.org/10.1016/j.gde.2015.03.003
- PubMed
- Google Scholar
1. Lindsay SJ
2. Rahbari R
3. Kaplanis J
4. Keane T
5. Hurles ME
(2019) Similarities and differences in patterns of germline mutation between mice and humans
Nature Communications 10:1–12.

https://doi.org/10.1038/s41467-019-12023-w
- PubMed
- Google Scholar
1. Liu R
2. Low WY
3. Tearle R
4. Koren S
5. Ghurye J
6. Rhie A
7. Phillippy AM
8. Rosen BD
9. Bickhart DM
10. Smith TPL
11. Hiendleder S
12. Williams JL
(2019) New insights into mammalian sex chromosome structure and evolution using high-quality sequences from bovine X and Y chromosomes
BMC Genomics 20:1–11.

https://doi.org/10.1186/s12864-019-6364-z
- PubMed
- Google Scholar
1. Lodato MA
2. Rodin RE
3. Bohrson CL
4. Coulter ME
5. Barton AR
6. Kwon M
7. Sherman MA
8. Vitzthum CM
9. Luquette LJ
10. Yandava CN
11. Yang P
12. Chittenden TW
13. Hatem NE
14. Ryu SC
15. Woodworth MB
16. Park PJ
17. Walsh CA
(2018) Aging and neurodegeneration are associated with increased mutations in single human neurons
Science 359:555–559.

https://doi.org/10.1126/science.aao4426
- PubMed
- Google Scholar
1. Makova KD
2. Li W-H
(2002) Strong male-driven evolution of DNA sequences in humans and apes
Nature 416:624–626.

https://doi.org/10.1038/416624a
- PubMed
- Google Scholar
1. Meunier J
2. Duret L
(2004) Recombination drives the evolution of GC-content in the human genome
Molecular Biology and Evolution 21:984–990.

https://doi.org/10.1093/molbev/msh070
- PubMed
- Google Scholar
(1987) Male-driven molecular evolution: a model and nucleotide sequence analysis
Cold Spring Harbor Symposia on Quantitative Biology 52:863–867.

https://doi.org/10.1101/sqb.1987.052.01.094
- PubMed
- Google Scholar
1. Moore L
2. Cagan A
3. Coorens THH
4. Neville MDC
5. Sanghvi R
6. Sanders MA
7. Oliver TRW
8. Leongamornlert D
9. Ellis P
10. Noorani A
11. Mitchell TJ
12. Butler TM
13. Hooks Y
14. Warren AY
15. Jorgensen M
16. Dawson KJ
17. Menzies A
18. O’Neill L
19. Latimer C
20. Teng M
21. van Boxtel R
22. Iacobuzio-Donahue CA
23. Martincorena I
24. Heer R
25. Campbell PJ
26. Fitzgerald RC
27. Stratton MR
28. Rahbari R
(2021) The mutational landscape of human somatic and germline cells
Nature 597:381–386.

https://doi.org/10.1038/s41586-021-03822-7
- PubMed
- Google Scholar
(2016) Variation in the molecular clock of primates
PNAS 113:10607–10612.

https://doi.org/10.1073/pnas.1600374113
- PubMed
- Google Scholar
1. Nachman MW
2. Crowell SL
(2000) Estimate of the mutation rate per nucleotide in humans
Genetics 156:297–304.

https://doi.org/10.1093/genetics/156.1.297
- PubMed
- Google Scholar
1. Oliver TRW
2. Chappell L
3. Sanghvi R
4. Deighton L
5. Ansari-Pour N
6. Dentro SC
7. Young MD
8. Coorens THH
9. Jung H
10. Butler T
11. Neville MDC
12. Leongamornlert D
13. Sanders MA
14. Hooks Y
15. Cagan A
16. Mitchell TJ
17. Cortes-Ciriano I
18. Warren AY
19. Wedge DC
20. Heer R
21. Coleman N
22. Murray MJ
23. Campbell PJ
24. Rahbari R
25. Behjati S
(2022) Clonal diversification and histogenesis of malignant germ cell tumours
Nature Communications 13:4272.

https://doi.org/10.1038/s41467-022-31375-4
- PubMed
- Google Scholar
1. Pagel M
(1999) Inferring the historical patterns of biological evolution
Nature 401:877–884.

https://doi.org/10.1038/44766
- PubMed
- Google Scholar
1. Pecon Slattery J
2. O’Brien SJ
(1998) Patterns of Y and X chromosome DNA sequence divergence during the Felidae radiation
Genetics 148:1245–1255.

https://doi.org/10.1093/genetics/148.3.1245
- PubMed
- Google Scholar
Software
1. Pedersen B
(2022) Mosdepth, version 556a460
GitHub.

https://github.com/brentp/mosdepth
1. Penrose LS
(1955) Parental age and mutation
Lancet 269:312–313.

https://doi.org/10.1016/s0140-6736(55)92305-9
- PubMed
- Google Scholar
1. Pink CJ
2. Hurst LD
(2010) Timing of replication is a determinant of neutral substitution rates but does not explain slow Y chromosome evolution in rodents
Molecular Biology and Evolution 27:1077–1086.

https://doi.org/10.1093/molbev/msp314
- PubMed
- Google Scholar
1. Presgraves DC
2. Yi SV
(2009) Doubts about complex speciation between humans and chimpanzees
Trends in Ecology & Evolution 24:533–540.

https://doi.org/10.1016/j.tree.2009.04.007
- PubMed
- Google Scholar
1. Prum RO
2. Berv JS
3. Dornburg A
4. Field DJ
5. Townsend JP
6. Lemmon EM
7. Lemmon AR
(2015) A comprehensive phylogeny of birds (Aves) using targeted next-generation DNA sequencing
Nature 526:569–573.

https://doi.org/10.1038/nature15697
- PubMed
- Google Scholar
1. Rahbari R
2. Wuster A
3. Lindsay SJ
4. Hardwick RJ
5. Alexandrov LB
6. Turki SA
7. Dominiczak A
8. Morris A
9. Porteous D
10. Smith B
11. Stratton MR
12. UK10K Consortium
13. Hurles ME
(2016) Timing, rates and spectra of human germline mutation
Nature Genetics 48:126–133.

https://doi.org/10.1038/ng.3469
- PubMed
- Google Scholar
1. Raudsepp T
2. Chowdhary BP
(2015) The eutherian pseudoautosomal region
Cytogenetic and Genome Research 147:81–94.

https://doi.org/10.1159/000443157
- PubMed
- Google Scholar
1. Reddy S
2. Kimball RT
3. Pandey A
4. Hosner PA
5. Braun MJ
6. Hackett SJ
7. Han K-L
8. Harshman J
9. Huddleston CJ
10. Kingston S
11. Marks BD
12. Miglia KJ
13. Moore WS
14. Sheldon FH
15. Witt CC
16. Yuri T
17. Braun EL
(2017) Why do phylogenomic data sets yield conflicting trees? Data type influences the avian tree of life more than taxon sampling
Systematic Biology 66:857–879.

https://doi.org/10.1093/sysbio/syx041
- PubMed
- Google Scholar
(2022) Oocytes maintain ROS-free mitochondrial metabolism by suppressing complex I
Nature 607:756–761.

https://doi.org/10.1038/s41586-022-04979-5
- Google Scholar
1. Sasani TA
2. Pedersen BS
3. Gao Z
4. Baird L
5. Przeworski M
6. Jorde LB
7. Quinlan AR
(2019) Large, three-generation human families reveal post-zygotic mosaicism and variability in germline mutation accumulation
eLife 8:e46922.

https://doi.org/10.7554/eLife.46922
- PubMed
- Google Scholar
1. Schield DR
2. Card DC
3. Hales NR
4. Perry BW
5. Pasquesi GM
6. Blackmon H
7. Adams RH
8. Corbin AB
9. Smith CF
10. Ramesh B
11. Demuth JP
12. Betrán E
13. Tollis M
14. Meik JM
15. Mackessy SP
16. Castoe TA
(2019) The origins and evolution of chromosomes, dosage compensation, and mechanisms underlying venom regulation in snakes
Genome Research 29:590–601.

https://doi.org/10.1101/gr.240952.118
- PubMed
- Google Scholar
(2021) Population genomic analyses confirm male-biased mutation rates in snakes
The Journal of Heredity 112:221–227.

https://doi.org/10.1093/jhered/esab005
- PubMed
- Google Scholar
(2021) Population sequencing data reveal a compendium of mutational processes in the human germ line
Science 373:1030–1035.

https://doi.org/10.1126/science.aba7408
- PubMed
- Google Scholar
1. Shearn R
2. Wright AE
3. Mousset S
4. Régis C
5. Penel S
6. Lemaitre J-F
7. Douay G
8. Crouau-Roy B
9. Lecompte E
10. Marais GA
(2020) Evolutionary stasis of the pseudoautosomal boundary in strepsirrhine primates
eLife 9:e63650.

https://doi.org/10.7554/eLife.63650
- PubMed
- Google Scholar
(1993a) Potential problems in estimating the male-to-female mutation rate ratio from DNA sequence data
Journal of Molecular Evolution 37:160–166.

https://doi.org/10.1007/BF02407351
- PubMed
- Google Scholar
(1993b) Male-driven evolution of DNA sequences
Nature 362:745–747.

https://doi.org/10.1038/362745a0
- PubMed
- Google Scholar
1. Siepel A
2. Haussler D
(2004) Phylogenetic estimation of context-dependent substitution rates by maximum likelihood
Molecular Biology and Evolution 21:468–488.

https://doi.org/10.1093/molbev/msh039
- PubMed
- Google Scholar
1. Siepel A
2. Bejerano G
3. Pedersen JS
4. Hinrichs AS
5. Hou M
6. Rosenbloom K
7. Clawson H
8. Spieth J
9. Hillier LW
10. Richards S
11. Weinstock GM
12. Wilson RK
13. Gibbs RA
14. Kent WJ
15. Miller W
16. Haussler D
(2005) Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes
Genome Research 15:1034–1050.

https://doi.org/10.1101/gr.3715005
- PubMed
- Google Scholar
(2013) Regions of XY homology in the pig X chromosome and the boundary of the pseudoautosomal region
BMC Genetics 14:1–7.

https://doi.org/10.1186/1471-2156-14-3
- PubMed
- Google Scholar
1. Smeds L
2. Kawakami T
3. Burri R
4. Bolivar P
5. Husby A
6. Qvarnström A
7. Uebbing S
8. Ellegren H
(2014) Genomic identification and characterization of the pseudoautosomal region in highly differentiated avian sex chromosomes
Nature Communications 5:1–7.

https://doi.org/10.1038/ncomms6448
- PubMed
- Google Scholar
(2016) Direct estimate of the rate of germline mutation in a bird
Genome Research 26:1211–1218.

https://doi.org/10.1101/gr.204669.116
- PubMed
- Google Scholar
1. Smith TB
2. Dun MD
3. Smith ND
4. Curry BJ
5. Connaughton HS
6. Aitken RJ
(2013) The presence of a truncated base excision repair pathway in human spermatozoa that is mediated by OGG1
Journal of Cell Science 126:1488–1497.

https://doi.org/10.1242/jcs.121657
- PubMed
- Google Scholar
1. Soler L
2. Alves S
3. Brionne A
4. Jacques A
5. Guérin V
6. Cherif-Feildel M
7. Combes-Soia L
8. Fouchécourt S
9. Thélie A
10. Blesbois E
11. McGrew MJ
12. Labas V
13. Govoroun MS
(2021) Protein expression reveals a molecular sexual identity of avian primordial germ cells at pre-gonadal stages
Scientific Reports 11:1–19.

https://doi.org/10.1038/s41598-021-98454-2
- PubMed
- Google Scholar
(2017) Sex determination in the mammalian germline
Annual Review of Genetics 51:265–285.

https://doi.org/10.1146/annurev-genet-120215-035449
- PubMed
- Google Scholar
Book
1. Strachan T
2. Read A
(2018) Human Molecular Genetics
New York: Garland Science.

https://doi.org/10.4324/9780203833544
- Google Scholar
1. Tatsumoto S
2. Go Y
3. Fukuta K
4. Noguchi H
5. Hayakawa T
6. Tomonaga M
7. Hirai H
8. Matsuzawa T
9. Agata K
10. Fujiyama A
(2017) Direct estimation of de novo mutation rates in a chimpanzee parent-offspring trio by ultra-deep whole genome sequencing
Scientific Reports 7:1–12.

https://doi.org/10.1038/s41598-017-13919-7
- PubMed
- Google Scholar
1. Thomas GWC
2. Wang RJ
3. Puri A
4. Harris RA
5. Raveendran M
6. Hughes DST
7. Murali SC
8. Williams LE
9. Doddapaneni H
10. Muzny DM
11. Gibbs RA
12. Abee CR
13. Galinski MR
14. Worley KC
15. Rogers J
16. Radivojac P
17. Hahn MW
(2018) Reproductive longevity predicts mutation rates in primates
Current Biology 28:3193–3197.

https://doi.org/10.1016/j.cub.2018.08.050
- PubMed
- Google Scholar
1. Vilenchik MM
2. Knudson AG
(2003) Endogenous DNA double-strand breaks: production, fidelity of repair, and induction of cancer
PNAS 100:12871–12876.

https://doi.org/10.1073/pnas.2135498100
- PubMed
- Google Scholar
1. Wang Z
2. Zhang J
3. Yang W
4. An N
5. Zhang P
6. Zhang G
7. Zhou Q
(2014) Temporal genomic evolution of bird sex chromosomes
BMC Evolutionary Biology 14:250.

https://doi.org/10.1186/s12862-014-0250-8
- PubMed
- Google Scholar
1. Wang RJ
2. Thomas GWC
3. Raveendran M
4. Harris RA
5. Doddapaneni H
6. Muzny DM
7. Capitanio JP
8. Radivojac P
9. Rogers J
10. Hahn MW
(2020) Paternal age in rhesus macaques is positively associated with germline mutation accumulation but not with measures of offspring sociability
Genome Research 30:826–834.

https://doi.org/10.1101/gr.255174.119
- PubMed
- Google Scholar
Preprint
1. Wang RJ
2. Peña-Garcia Y
3. Bibby M
4. Raveendran M
5. Harris RA
6. Jansen HT
7. Robbins CT
8. Rogers J
9. Kelley JL
10. Hahn MW
(2022a) Hibernation shows no apparent effect on germline mutation rates in grizzly bears
bioRxiv.

https://doi.org/10.1101/2022.03.15.481369
- Google Scholar
1. Wang RJ
2. Raveendran M
3. Harris RA
4. Murphy WJ
5. Lyons LA
6. Rogers J
7. Hahn MW
(2022b) De novo mutations in domestic cat are consistent with an effect of reproductive longevity on both the rate and spectrum of mutations
Molecular Biology and Evolution 39:msac147.

https://doi.org/10.1093/molbev/msac147
- PubMed
- Google Scholar
1. Waters PD
2. Patel HR
3. Ruiz-Herrera A
4. Álvarez-González L
5. Lister NC
6. Simakov O
7. Ezaz T
8. Kaur P
9. Frere C
10. Grützner F
11. Georges A
12. Graves JAM
(2021) Microchromosomes are building blocks of bird, reptile, and mammal chromosomes
PNAS 118:e2112494118.

https://doi.org/10.1073/pnas.2112494118
- Google Scholar
(2011) Do variations in substitution rates and male mutation bias correlate with life-history traits? A study of 32 mammalian genomes
Evolution; International Journal of Organic Evolution 65:2800–2815.

https://doi.org/10.1111/j.1558-5646.2011.01337.x
- PubMed
- Google Scholar
1. Wolfe KH
2. Li W-H
(2003) Molecular evolution meets the genomics revolution
Nature Genetics 33 Suppl:255–265.

https://doi.org/10.1038/ng1088
- PubMed
- Google Scholar
1. Wong WSW
2. Solomon BD
3. Bodian DL
4. Kothiyal P
5. Eley G
6. Huddleston KC
7. Baker R
8. Thach DC
9. Iyer RK
10. Vockley JG
11. Niederhuber JE
(2016) New observations on maternal age effect on germline de novo mutations
Nature Communications 7:1–10.

https://doi.org/10.1038/ncomms10486
- Google Scholar
1. Wu FL
2. Strand AI
3. Cox LA
4. Ober C
5. Wall JD
6. Moorjani P
7. Przeworski M
(2020) A comparison of humans and baboons suggests germline mutation rates do not track cell divisions
PLOS Biology 18:e3000838.

https://doi.org/10.1371/journal.pbio.3000838
- PubMed
- Google Scholar
1. Yang C
2. Zhou Y
3. Marcus S
4. Formenti G
5. Bergeron LA
6. Song Z
7. Bi X
8. Bergman J
9. Rousselle MMC
10. Zhou C
11. Zhou L
12. Deng Y
13. Fang M
14. Xie D
15. Zhu Y
16. Tan S
17. Mountcastle J
18. Haase B
19. Balacco J
20. Wood J
21. Chow W
22. Rhie A
23. Pippel M
24. Fabiszak MM
25. Koren S
26. Fedrigo O
27. Freiwald WA
28. Howe K
29. Yang H
30. Phillippy AM
31. Schierup MH
32. Jarvis ED
33. Zhang G
(2021) Evolutionary and biomedical insights from a marmoset diploid genome assembly
Nature 594:227–233.

https://doi.org/10.1038/s41586-021-03535-x
- PubMed
- Google Scholar
1. Zhao D
2. McBride D
3. Nandi S
4. McQueen HA
5. McGrew MJ
6. Hocking PM
7. Lewis PD
8. Sang HM
9. Clinton M
(2010) Somatic sex identity is cell autonomous in the chicken
Nature 464:237–242.

https://doi.org/10.1038/nature08852
- PubMed
- Google Scholar
1. Zhou Z-X
2. Lujan SA
3. Burkholder AB
4. St Charles J
5. Dahl J
6. Farrell CE
7. Williams JS
8. Kunkel TA
(2021) How asymmetric DNA replication achieves symmetrical fidelity
Nature Structural & Molecular Biology 28:1020–1028.

https://doi.org/10.1038/s41594-021-00691-6
- PubMed
- Google Scholar
1. Zoonomia Consortium
(2020) A comparative genomics multitool for scientific discovery and conservation
Nature 587:240–245.

https://doi.org/10.1038/s41586-020-2876-6
- PubMed
- Google Scholar

Article and author information

Author details

Marc de Manuel

Department of Biological Sciences, Columbia University, New York, United States

Contribution
Data curation, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing

Contributed equally with
Felix L Wu

For correspondence
md3914@columbia.edu

Competing interests
No competing interests declared

ORCID iD: 0000-0002-1245-0127

Felix L Wu

Department of Biological Sciences, Columbia University, New York, United States

Contribution
Data curation, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing

Contributed equally with
Marc de Manuel

For correspondence
flw2113@cumc.columbia.edu

Competing interests
No competing interests declared

ORCID iD: 0000-0002-0155-9071

Molly Przeworski

Department of Systems Biology, Columbia University, New York, United States

Contribution
Conceptualization, Supervision, Funding acquisition, Writing – original draft, Project administration, Writing – review and editing

For correspondence
mp3284@columbia.edu

Competing interests
Senior editor, eLife

ORCID iD: 0000-0002-5369-9009

Funding

National Institutes of Health (GM122975)

Molly Przeworski

Human Frontier Science Program (LT000257/2021-L)

Marc de Manuel

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Ziyue Gao, Guy Sella, and the Coop and Schierup labs for their comments on earlier versions of the manuscript. We thank Rusty Lansford, Mike McGrew, and Daniel Hooper for discussions about avian development and evolution; Turk Rhen for discussions about reptile sex determination; Carla Hoge and Zach Fuller for sharing their corn snake genome assembly; Anne Bronikowski and the Vertebrate Genome Project for sponsoring and generating the Thamnophis elegans assembly; Alex Cagan for early access to data of mutation burdens in colonic crypts across mammals; Richard Wang and Matthew Hahn for sharing data on DNM in cats; Carole Charlier and Michel Georges for sharing data on DNM in cattle; Adam Siepel for help with applying the phyloFit program; and Peter Andolfatto, Michael B Eisen, Priya Moorjani, as well as William R Milligan, Anna Yoney, and other members of the Andolfatto, Przeworski, and Sella labs for helpful discussions. This work was funded by GM122975 to MP and an HFSP postdoctoral fellowship to MdM.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.