Contingency and selection in mitochondrial genome dynamics

Abstract
Editor's evaluation
Introduction
Results
Discussion
Materials and methods
Appendix 1
Appendix 2
Data availability
References
Article and author information
Metrics

Abstract

High frequencies of mutant mitochondrial DNA (mtDNA) in human cells lead to cellular defects that are associated with aging and disease. Yet much remains to be understood about the dynamics of the generation of mutant mtDNAs and their relative replicative fitness that informs their fate within cells and tissues. To address this, we utilize long-read single-molecule sequencing to track mutational trajectories of mtDNA in the model organism Saccharomyces cerevisiae. This model has numerous advantages over mammalian systems due to its much larger mtDNA and ease of artificially competing mutant and wild-type mtDNA copies in cells. We show a previously unseen pattern that constrains subsequent excision events in mtDNA fragmentation in yeast. We also provide evidence for the generation of rare and contentious non-periodic mtDNA structures that lead to persistent diversity within individual cells. Finally, we show that measurements of relative fitness of mtDNA fit a phenomenological model that highlights important biophysical parameters governing mtDNA fitness. Altogether, our study provides techniques and insights into the dynamics of large structural changes in genomes that we show are applicable to more complex organisms like humans.

Editor's evaluation

This work provides further insight into long-time outstanding questions in the field of mitochondrial genetics using long-read sequence analysis and biophysical modeling. Specifically, the authors show how replication origins in mitochondrial DNA are recombination hotspots that can result in excision cascades, that lead to a variety of different mitochondrial mutants and in some cases even heteroplasmic cells. Finally, crossing wild-type cells with various mitochondrial mutants allowed the development of a model for the suppressivity (fitness) of different mitochondrial variants that suggests that the density of replication origins in different repeated units is a major determinant of mtDNA suppressivity.

https://doi.org/10.7554/eLife.76557.sa0

Introduction

The mitochondrial DNA (mtDNA) in eukaryotic cells encodes a subset of enzymes involved in cellular respiration. Interestingly, the integrity of mtDNA has been implicated in critical biological processes other than respiration such as in apoptosis, trace element and intermediary metabolism, heme synthesis, and iron-sulfur cluster biogenesis (Veatch et al., 2009). Because mtDNA exists in multiple copies within numerous mitochondrial compartments, localized mtDNA damage produces heteroplasmic states with coexisting mutant and wild-type mtDNA in cells. Both intracellular mtDNA dynamics and intercellular selection then ultimately shape the fate of cell populations, with mtDNA damage resulting in cellular defects in single-celled organisms such as yeast and aging and disease in multicellular organisms such as humans.

In humans, large mtDNA deletions accumulate during the course of aging in skeletal muscle and brain tissue (Fayet et al., 2002; Kraytsberg et al., 2006; Payne and Chinnery, 2015), and result in observable cellular defects (Chan, 2006). The same types of deletions have also been implicated in numerous diseases, including Parkinson’s disease (Bender et al., 2006). While these mtDNA deletions have been widely observed, much remains to be understood about the dynamics that lead to the propagation of these deletions within cells and resultantly in the tissues of humans. This is partly because of the strong dependence of human cells on mtDNA for their survival, and the complexities in artificially creating heteroplasmy or modifying mtDNA in mammalian systems (Mok et al., 2020). To address some outstanding questions regarding the dynamics of mutant mtDNA, in this article, we explore mtDNA dynamics using yeast as a model organism. Yeast is particularly well suited to study mtDNA dynamics due to the dispensability of mtDNA and because heteroplasmic cells containing mutant and wild-type mtDNA are easy to artificially construct.

In yeast, mtDNA deletions were first linked to the Petite phenotype (Ephrussi, 1949; Ephrussi, 1953; Ephrussi et al., 1949). Petite colonies are smaller than their wild-type counterparts under respiration conditions due to mtDNA deletions that render cells incapable of respiration. These deletions are due to destructive recombination events between short repeated homology in mtDNA that excise portions of the wild-type genome (Bernardi et al., 1976; Bernardi and Bernardi, 1980; Marotta et al., 1982; de Zamaroczy et al., 1983). Excision events are followed by selection for sub-genomic (nonfunctional) mtDNA fragments that contain a high density of replication origins (Goursot et al., 1980; Blanc and Dujon, 1980; de Zamaroczy et al., 1979; de Zamaroczy et al., 1981). When subgenomic fragments have replication origin densities higher than the larger wild-type genome, they consistently outcompete or ‘suppress’ wild-type genomes within cells. Suppressivity, which is a measure of this replicative advantage of Petite mtDNAs over wild-type, was shown to correlate with origin density and was reduced when replication origins were disrupted or absent, constituting the rules of suppressivity (de Zamaroczy et al., 1981; Mangin et al., 1983; Bernardi, 2005). Altogether, rolling circle replication coupled with this excision and selection for replication origins results in the formation of complex concatemer structures in the mtDNA of Petite cells that often contain multiple replication origins from distant locations of the wild-type genome (Locker et al., 1974; Locker et al., 1979; Faugeron-Fonty et al., 1979).

Near the time of the complete sequencing of the mitochondrial genome of yeast in 1998 (Foury et al., 1998), work on the structural details of mtDNA that lead to the aforementioned discoveries in Petites appeared to wane. However, a number of open questions about the dynamics of Petite mtDNAs, which are at their core a result of mtDNA deletions, remain to be explored fully.

Regarding the structure and generation of Petite mtDNA, three questions remained to be addressed. These include: (1) What drives mtDNA excision events in Petites to cluster near replication origins? Previous work shows that excisions occur all throughout the genome but with a higher density near replication origins (Bernardi and Bernardi, 1980; de Zamaroczy et al., 1983; de Zamaroczy and Bernardi, 1986; Marotta et al., 1982; Osman et al., 2015). The interplay between location-specific excision frequencies and selection for origin-containing fragments remains entangled. (2) What is the nature and dynamics of the ongoing excision cascades in Petites, that is, how do subsequent excisions relate to previous excisions? Previous work in Bernardi et al., 1976; Lewin et al., 1978; Lewin et al., 1979; Locker et al., 1979 showed qualitatively that persistent heterogeneity in mtDNA content was present in the sequencing of Petite strains, pointing to continuing excision events. However, these works did not provide a quantitative description of this heterogeneity or explore the relationship between the structure of the coexisting mtDNAs. (3) Are the contentious and rare non-periodic mtDNA structures observed in yeast real? The studies of Heyting et al., 1979; Bos et al., 1980 provides evidence for non-periodic structures, which is unexpected given that rolling circle replication produces periodic, tandemly duplicated structures. The work in Faugeron-Fonty et al., 1983 refutes these observations, providing a conflicting hypothesis which remains to be reconciled.

Concerning the distribution of mtDNAs at a cell population level, another open question is: (4) How is the observed structural heterogeneity of mtDNA in yeast colonies partitioned among individual cells? The work in Lewin et al., 1978; Lewin et al., 1979; Locker et al., 1979 points to homoplasmic contributions to Petite-colony heterogeneity. The extent of homoplasmic and heteroplasmic contributions to colony-level heterogeneity remains to be quantified.

And finally, given an understanding of both mtDNA structure dynamics and its partitioning in populations, the final question we address is related to the structure-function relationship of mtDNA: (5) What contributes to the fitness of mtDNA structures, and how does structure inform suppressivity? The suppressivity rules provided previously in de Zamaroczy et al., 1981, which described how mtDNA structure influenced suppressivity, were limited to reduced Petite genomes with small sizes and relatively high suppressivities. Do these same rules explain suppressivity across a larger range of genome structures and suppressivities, and can we construct a biophysical model of suppressivity that relies on these rules?

In this study, we address each of these long-standing questions with new long-read sequencing technology and accompanying structural inference methods. We highlight some advantages of Nanopore sequencing in addressing these questions and future ones, but also technical challenges specific to structure reconstruction with Nanopore sequencing of the mtDNA in yeast. Among new answers to all of the aforementioned questions, we showcase a previously unseen pattern that constrains subsequent excision events in generating new Petite mtDNA structures from existing ones, settle contention in the literature surrounding the existence and generation of non-periodic ‘mixed’ Petite structures, and propose a phenomenological model of suppressivity that highlights important biophysical parameters governing mtDNA fitness. Finally, we connect these observations in yeast to mtDNA deletions in humans which exhibit remarkably similar patterns.

Results

Overview of the structure of Grande and Petite mtDNA

To quantify mtDNA structure and their dynamics, we opted to sequence both Petite and Grande colonies with a Nanopore MinION single-molecule sequencing platform. We expected the long reads generated from this sequencing technology to improve structure reconstruction for both high and low frequency structures compared to short-read sequencing approaches. In total, we sequenced 38 Petite colonies derived from 9 spontaneous Petite colonies through passaging and 10 Grande (wild-type) colonies of the same Saccharomyces cerevisiae strain. Four of these Grande colonies were cultured under non-fermentable media (YPG), and six in fermentable media (YPAD). Starting with nine spontaneous Petite colonies, each colony was passaged twice onto new media (YPAD), storing and culturing three colonies at each passage. This generated families of Petite colonies, with nine colonies sharing each spontaneous progenitor after two passages (Figure 1a). The suppressivities of all colonies sharing a progenitor were measured (see Materials and methods), but only a subset was sequenced (dotted circles in Figure 1a). Subfamilies, labeled in Figure 1a as a subscript, were grouped based on differing mtDNA content from other members in the same family. The coverage curves from the sequencing of each Petite colony and a subset of Grande colonies are shown in Figure 1b and provide a coarse picture of their mtDNA content. It is evident that some Petite colonies within families, such as families 1 and 4, have differing mtDNA content but share the same spontaneous Petite colony progenitor. This diversity is most likely the result of early mtDNA instability in spontaneous progenitor colonies that segregate into different cells through genome bottlenecks and are sampled through passaging. It is also possible that this diversity is a result of ongoing mtDNA changes during the growth of the colony before sequencing. Nevertheless, comparing the mtDNA content in colonies that share second passage progenitors reveals that two passages followed by culturing (~32 generations) was sufficient to homogenize mtDNA content in all cases except family 1. Family 1 we believe to be a special case where early mtDNA instability occurred in the spontaneous progenitor, and then again in the second passage progenitor of colonies 1b and 1c or during their growth. Given that we sampled such a case in our experiments suggests roughly 1 in 10 chances of such events, but would require a larger study to quantify it.

Figure 1

Download asset Open asset

Overview of the experiment and observed mtDNA diversity in sequenced yeast colonies.

(a) An overview of the architecture of the Petite colony sequencing experiment in this study. Nine spontaneous Petite colonies were passaged twice onto new media, culturing and storing three colonies for each passage. This produced families of colonies (indicated by color), where all colonies after two passages were derived from the same spontaneous Petite colony progenitor, but only a subset of colonies was cultured and then sequenced with a Nanopore MinION sequencing device (indicated by dotted circles). In addition to families, subfamilies are labeled as a subscript and grouped based on the predominant mtDNA structure present in these colonies according to sequencing results. (b) The sequencing coverage (arbitrary coverage scaling, consistent genome reference location) in all Petite colonies in addition to a subset of wild-type (WT), or Grande colonies sequenced. Ten Grande colonies were sequenced as a reference, four after growth under non-fermentable conditions (YPG), and six under fermentable conditions (YPAD). Border colors correspond to (a), and black borders are examples of Grande colony coverages.

Mapping of the mtDNA to a reference sequence, followed by careful filtering of inverted duplication artifacts (Appendix 1—figure 1) and clustering of alignment breakpoint signals with a variety of parameters (see Materials and methods), revealed both inverted and non-inverted mtDNA breakpoints in all Petite colonies and rare mtDNA breakpoints in Grande colonies. These breakpoint signals delineate sequence alignments that are collinear with the reference mtDNA sequence, but merged in such a way that disjoint alignment locations on the reference genome have been brought together. Non-inverted breakpoints indicate the merging of disjoint sequences in the reference from the same strand, or with the same orientation, while inverted breakpoints indicate the merging of disjoint mtDNA sequences on opposite strands (Figure 2a). Long reads with an average read length of 6 kbp and maximum length of 120 kbp directly revealed that these breakpoint signals were contributed by concatemer structures in Petites, composed of tandem repeats of sub-genome sized repeat units that had been excised from the wild-type genome and amplified into repeated structures (Figure 2b —leftmost spiral). Grande reads also revealed concatemer structures manifested in reads as subsamples of genome-sized repeating units devoid of breakpoint signals (Figure 2b —rightmost spiral). These concatemer structures in Petites with sub-genome sized repeat units are consistent with the existing literature on mtDNA structure in Petites that relied on restriction digestion mapping and electron microscopy (Bernardi et al., 1975; Bernardi et al., 1976; Lewin et al., 1978; Locker et al., 1979; Faugeron-Fonty et al., 1979; Bernardi and Bernardi, 1980).

Figure 2

Download asset Open asset

The mtDNA in yeast exists as concatemers that are delineated by breakpoint signals in sequencing alignments.

(a) A schematic of the definition of alignments and breakpoint signals. Alignments (which are sequences collinear with the reference genome) with their location on the reference are shown as colored arrows alongside the coordinates of alignment edges (a, b) and (c, d). A hypothetical read is shown below these alignments, indicating how the alignments are oriented with respect to each other and the coordinates of the alignment edges in contact that define the breakpoints denoted as arcs. Non-inverted breakpoints represent merged alignments from disjoint locations on the reference that map to the same strand of DNA (red arcs), while inverted breakpoints represent the same disjoint merging of alignments but with different orientation (black arcs). (b) Representative examples of mtDNA structures from sequencing reads in Petite and Grande samples. The top of this panel shows the mitochondrial reference and annotated features of the genome in colored blocks. Below the reference are long sequencing reads wrapped around themselves in a spiral that display the same annotated features as colored blocks. These spiral plots also include red bars which indicate breakpoint locations. Black regions indicate unmapped portions near the ends of the read due to adapters and barcodes. The spiral on the left is a sequencing read from a Petite colony, showing that two origins of replication have been excised from the wild-type genome and tandemly repeated in a concatemer structure. The spiral on the right is a read from a Grande colony showing a portion of a linear segment of the genome, without breakpoints (red bars) except at the ends of the reads which mark the end of alignments. (c) Summary of mtDNA breakpoints detected across 38 Petite colonies, that were derived from 9 spontaneous petite colonies through passaging (above diagonal), and 10 Grande colonies (below diagonal). In this scatter plot, each marker represents the centroid of a cluster of mtDNA breakpoint signals in reference coordinates from reads in a single sample. X and Y coordinates of each marker are the regions of mtDNA that interact to produce the breakpoint. The numbers of breakpoints in each family are indicated in the legend, as well as the numbers of colonies in each family sequenced. Also indicated are the regions on the reference genome that make up the repeating unit in the Petite read shown in (b) and the subsection of the reference genome that contributes to the Grande read in (b).

All Petite colonies contained at least one breakpoint signal and often a diverse set of breakpoints totaling 84 breakpoints across 38 Petite colonies, whereas in Grande colonies sequenced only 2 had high confidence breakpoints detected, with a total of 3 breakpoints across 10 Grande colonies (Figure 2c and Appendix 1—figure 2). The diversity in location of mtDNA breakpoints within Petite families and breakpoint counts greater than the number of members of each family/subfamily also echo the diversity observed in the coverage plots but with more detail. These diverse breakpoint distributions within families indicate either structural diversity in the progenitor colony, continued changes in mtDNA structure resulting in subfamilies or coexisting structures in colonies, or multiple breakpoint signals within colonies indicating more complex mtDNA structures generated by multiple excision events.

What drives mtDNA excision events in Petites to cluster near replication origins?

The prevailing theory for the formation of Petites relies on sequence-specific illegitimate recombination within the wild-type DNA molecule between repeated GC clusters and AT stretches (Bernardi and Bernardi, 1980; de Zamaroczy et al., 1983; de Zamaroczy and Bernardi, 1986), which are prevalent in all noncoding regions of the mitochondrial genome in yeast. In particular, the extensive homology of the eight mitochondrial origins of replication and their inclusion of similar GC clusters (de Zamaroczy and Bernardi, 1986) suggest important regions for illegitimate recombinations. Evidence for hybrid origins resulting from recombination between adjacent origins in the wild-type genome have been seen in restriction digestion data (Marotta et al., 1982). Large structural variations and smaller mtDNA variations have also been observed in Illumina sequencing of Petites to cluster within origins and within close proximity to origins (Osman et al., 2015). Given that replication origins were implicated in previously observed Petite mtDNA excisions and a variety of mtDNA variations, we were curious to understand the involvement of replication origins across the diverse set of excision events we observed. Placing non-inverted breakpoint locations and replication origin locations on mitochondrial reference coordinates reveals the clustering of breakpoints near edges of interacting origins of the same orientation (Figure 3a). In fact, ~30% of breakpoints reside within replication origins, indicating that the structures containing these breakpoints have origins that are perturbed by excisions, or hybrid origins. An additional ~20% of breakpoints reside within 275 bp of the edge of an origin. The remaining 50% of breakpoints are located from 275 bp to 3 kbp from the edge of an origin.

Figure 3

Download asset Open asset

Replication origin and mtDNA excision proximity explained by random excision and selection for origin-dense fragments.

(a) The colocalization of replication origins and alignment breakpoint locations due to excisions. Black dots represent the centroids of breakpoint clusters (see Materials and methods), and blue and green shading highlights replication origins and their orientation. Darker black dots are due to overlaps of breakpoints, indicating high densities of breakpoints at these locations. (b) A cumulative plot of the displacements between breakpoint edges and closest origins of replication, where the blue curve shows this enrichment of breakpoints near replication origins (top left, (c)). The orange curve represents a simulation of uniform random alignments placed on the reference genome following the true alignment length distribution in the data (bottom left, (c)). The black curve represents the simulation of alignments between randomly selected perfect repeats ≥11 bp on the reference sequence (bottom right, (c)). The green curve agrees much better with the data (blue) curve, which is the same simulation of random alignments placed on the reference following the length distribution of the data, but with the requirement that these alignments span some portion of a randomly selected origin of replication (top right, (c)). (c) A schematic of the models plotted in (b). Alignments are denoted as arrows, with distances between alignment edges (breakpoints) and replication origins (dotted boxes) as dimension lines. Circled drawings depict that uniformly random alignments are selected in the orange model, whereas alignments conditioned on spanning replication origins are present in the green model.

Next, we asked if this non-uniform pattern of excision revealed new rules for mtDNA fragmentation. Besides the potential role of homology of the replication origins, it has been noted that a high density of unperturbed replication origins in Petite structures result in a replication advantage for Petite mtDNAs over wild-type mtDNAs (Blanc and Dujon, 1980; de Zamaroczy et al., 1981; Mangin et al., 1983). In concatemer structures, this means that smaller repeated fragments containing replication origins are more fit than wild-type fragments when in competition with each other. This leads to a natural question of whether or not clustering near origins is due to higher frequency recombination within or near replication origins, or if it is due to selection on a pool of arbitrary excisions with selection for the resulting small fragments containing replication origins. To this end, we compared the distribution of displacements between breakpoints and the closest replication origins (Figure 3b, blue curve) to three different models. In the first model (Figure 3b, orange curve), we plotted the same displacement distribution for uniform random mtDNA fragments, with a size distribution given by the sequencing data, placed on the reference mitochondrial genome. In the second model (Figure 3b, black curve), we plotted the displacement distribution for random fragments between perfect repeats of greater than 11 bp in the reference, which is motivated by the fact that excisions require perfect repeats or highly homologous regions. In the third model (Figure 3b, green curve), we plotted uniform random fragments as in the orange curve, with a length distribution from the sequencing data, but conditioned on spanning a portion of a randomly selected origin of replication. A schematic summarizing these models is provided in Figure 3c.

The orange and black distribution captures the breakpoint displacements from origins expected from random excision events and no selection for replication origin containing fragments. Note that the similarity between the black and orange curves demonstrates the prevalence of repeated homology in the mitochondrial reference genome. The green curve captures random excision, but strong selection for small replication origin containing fragments due to the requirement for alignments to contain a portion of an origin in this model. The observed data agrees nicely with the green curve for most of the domain of the distribution. The empirical origin to breakpoint distributions and their agreement with the green model are also consistent across a variety of breakpoint clustering/filtering parameter regimes, which have minor effects on the individual breakpoints extracted from the sequencing data, but little effect on these distributions (Appendix 1—figure 3, Appendix 2—table 2). Thus, while we cannot exclude a model of non-random excisions favoring close origin proximity, the bulk of the minimum breakpoint to origin displacement distributions observed can be explained by random excision and strong selection for small origin containing fragments which is in agreement with the prevailing theory of Petite mtDNA formation.

What is the nature and dynamics of the ongoing excision cascades in Petites?

Next, we were curious to know if we could observe ongoing excision cascades and whether or not subsequent excisions in Petites differed from initial excisions in Grandes that generated the first Petite mtDNAs. In 16 Petite samples sequenced from families {1_a, 1_b, 2, 3, 8_a} there were detectable levels of repeated structures that differed from the ‘primary’ mtDNA structures which span the longest portion of the reference genome and generally contribute to the majority of mtDNA (see Materials and methods on details of structure reconstruction). These lower frequency structures, or ‘alternate’ structures as they will be described from here on, were found to contribute from 0.1% to 59% of total mitochondrial content in these samples. Following multiple passagings of Petite colonies before sequencing, which would rapidly dilute any initially coexisting structures due to mtDNA bottlenecks (Ling and Shibata, 2004), these alternate structures most likely result from subsequent excisions of the primary structure during culturing. Such ‘excision cascades,’ where further excisions act on existing Petite fragments were hypothesized and discussed by Locker et al., 1979; Marotta et al., 1982; Bernardi, 2005, where it was suggested that the varying levels of alternate structures will depend on their generation rate and selective advantage in replication over the primary structure. Part of what makes the dynamics of mtDNA within these cascades interesting is the multiplicity afforded by a large variety and number of potential excisions. Particular excisions that bring regions of homology together may open up entirely new trajectories of excision dynamics that were previously unlikely or inaccessible due to mtDNA conformation.

An extreme example of such an excision cascade is given in Figure 4a, where both primary and alternate alignments are shown on the reference mitochondrial genome location in addition to sequencing coverage. In Figure 4b, the structures of the repeated units composed of these alignments are also shown alongside their calculated mitochondrial content frequencies (see Materials and methods—Primary/alternate structural frequency calculations). The first thing to note is that the locations of alignments extracted from our structural detection pipeline that are involved in repeat units align well with the total sequencing coverage in Figure 4a. This diversity of alignments in Figure 4a is also corroborated by colonies that share the same progenitor. Each of the coverage curves of members of family 1 in Figure 1b share peaks with the coverage curve in Figure 4a. Second, there is significant diversity in the type of repeat units and necessary steps in their generation which stitch together these alignments in Figure 4b: Repeat unit #1 shown in Figure 4b (yellow and pink) is an example of a secondary excision across a segment of mtDNA containing the preexisting primary breakpoint, as both alignments come from opposite ends of the primary alignment and are stitched together. This immediately suggests that the excision occurred in mtDNA in a concatemer form, and across the repeat unit breakpoint (green and purple Type II excision; Figure 4c). Repeat unit #2 (blue, maroon, and green) also spans the primary breakpoint, but has an additional alignment in a different orientation that either resulted from two excisions or recombination of different repeat units (green, purple, and gray Type II excision; Figure 4c). Repeat unit #3 (green) is an example of an excision within the primary repeat unit and away from its edges (Type I excision; Figure 4c). Repeat unit #4 (purple) is an example of a repeat unit that shares only one edge with the primary alignment, but with this edge interacting with a different region of the genome producing a new alignment (Type III excision; Figure 4c).

Figure 4

Download asset Open asset

mtDNA excision cascades, quantification of colony structural diversity, and contingency in subsequent excisions.

(a) Example locations of alignments from mapped reads (linear alignments here are bounded by breakpoints) observed in long reads within Petite sample 1_b. Endpoints of arrows are the mean breakpoint location for the cluster of breakpoint signals that punctuate alignments in repeats. The first panel shows a primary alignment which has the longest span on the genome and exists in long repeats. The second panel shows smaller alternate alignments that exist within detected repeated structures at a lower frequency. Also included in gray is a sequencing coverage map of this sample. (b) Excision cascade in Petite sample 1_b. This plot shows the same primary repeat unit in the first panel, and its contribution to total mitochondrial content as a percentage. The second panel shows the forms of alternate structures present in the same sample which were derived from the primary alignment, alongside their mitochondrial contribution as a percentage. (c) A schematic of the multiplicity of excision events that generate alternate repeat units. The primary concatemer is the red structure, where arrows indicate the alignments (contiguous regions of the reference) that make up the repeating units. The primary breakpoint between these alignments is denoted as a red arc. Colored rectangles above the alignments with the same colors indicate regions of homology in the primary structure that can interact to produce an excision. Dotted rectangles indicate excision sites that produce the alignments shown below them. In the lower half of the figure, five distinct excision events that can generate different repeat units are shown. These are grouped into three excision classes in the data: Type I, where excision occurs within primary alignments, Type II, where excisions span the existing primary breakpoint, and Type III, where excisions share one edge of the primary breakpoint. (d) The frequency of each class of excision across 35 alternate structures detected in the data. (e) A plot of the distance between alternate alignment edges and their closest primary alignment edge across all three classes, normalized by primary alignment length.

In Figure 4c, we provide a schematic of the types of alternate repeat units observed across all samples and the plausible mechanisms of generation. We classify the resulting alternate repeat unit into three easily distinguishable classes in our data: Type I alternate repeat units are regions excised from the interior of primary repeat units. Type II alternate repeats contain or span the primary breakpoint, resulting from an excision across the breakpoint between two primary repeat units in a concatemer form. Type III alternate repeat units share one edge with the primary breakpoint and have a new edge within the primary alignment. For the technical details in the classification of these repeat unit types, see Materials and methods—Type I/II/III repeat unit classification. The proportions of each class of alternate repeats (35 total) across all samples are shown in Figure 4d, where it is clear Type III breakpoints make up the majority (57%) of alternate repeated structures observed across the 16 colonies where we see alternate structures. In Figure 4e, we also plot the distance between the closest primary and alignment edges for each class of repeat normalized by the primary alignment length. It is clear from this figure how close subsequent excisions are to primary breakpoints in the most abundant class, Type III, with a mean and median fractional distance between the alternate and primary edge of 2% and 1%, respectively. The mean and median fractional distances of Type II and Type III repeats are also comparable, which is only expected if Type III repeats truly share an edge of the primary breakpoint; Type II repeats directly recapitulate both primary alignment edges as they contain a perfect copy of the breakpoint. Meanwhile, Type III structures just need to have an edge close enough to either edge in the primary breakpoint to be within the sequencing error that defines the size of breakpoint clusters. So, the fact that both Type II and Type III are comparable in these distances, strongly suggests that Type III structures reuse part of the primary excision site. Therefore, the abundance of Type III repeats indicates a strong preference for secondary excisions at the site that produced the primary alignment itself, that ultimately constrains the trajectories of subsequent excision events in an unexpected, and previously unreported way. It is unclear at this time whether Type II repeats, which encompass intact primary breakpoints, are related to this phenomenon. In general, this pattern of a preference of excisions around or across primary breakpoints is consistent across a variety of clustering/filtering parameters for breakpoints detection (Appendix 1—figure 4, Appendix 2—Tables 1 and 2), the details of which are described in methods.

Are the contentious and rare non-periodic mtDNA structures observed in yeast real?

Seven colonies within family 3 in Figure 1b displayed distinguishably higher variance in coverage than the rest of the Petite colonies sequenced. This variance in coverage suggested either a complex repeat unit which itself contained smaller repeated units, or heterogeneity of mtDNA content in these samples. As such, we were interested to understand the source of this coverage variability. In these colonies, sequencing revealed non-periodic or non-tandemly duplicated primary structures involving partial inverted duplications of sequences. This is in contrast to the repeated units as concatemers that are found in the remainder of the Petite colonies and are primarily in tandemly repeated (non-inverted) forms. These non-periodic structures resemble the ‘mixed’ structures first characterized in detail in Heyting et al., 1979. The structure of one of these colonies is detailed in Figure 5 and is representative of all seven ‘mixed’ structure colonies as they contain indistinguishable alignments and are derived from the same spontaneous Petite colony. In Figure 5a, four alignments of different lengths and reference locations are depicted as arrows. Note that all four alignments share the common region of the red alignment. Also included is the sequencing coverage of this particular colony, which aligns nicely with these alignments extracted from our structural repeat detection pipeline, as well as the ranked length and absolute count of alignments. Correcting for sampling bias (see Materials and methods—Mixed structure alignment frequency calculations) due to the sampled read length distributions across seven colonies with this same structure reveals that each alignment exists in equal proportions in colonies that harbor this structure (Figure 5b). Example structures in long reads selected from one of these samples are provided in Figure 5c, where the mixed structure is evident with seemingly random orientations of alignments. These ‘mixed’ structures are clear examples of intramolecular heterogeneity in mtDNA and likely intermolecular heterogeneity across a population of mtDNA fragments within cells given the differences in the content of the fragments observed. To attempt to make sense of this structure, which is at odds with the concatemer structures observed in all other colonies, we applied the repeat detection pipeline to see if any reads exhibited repeated structures. Interestingly, while superficially, this structure seems devoid of a clear pattern and appears uniformly randomized, across all seven colonies with the structure shown in Figure 5c, we did see some evidence of partially repeated structures, where the same alignments were repeated with the same orientation but separated by other single inverted alignments (Figure 5d). In all seven colonies with these same four alignments, these partially repeated structures are composed of the concatenation of largest and smallest alignments with opposite orientation, or the concatenation of the second-largest and second-smallest alignments.

Figure 5

Download asset Open asset

Non-periodic ‘mixed’ mtDNA structures and their mechanism of generation.

(a) Alignment locations and their raw counts present in the primary structure of a ‘mixed’ repeat Petite colony. (b) Proportions of each four alignments across seven mixed Petite colonies sequenced after accounting for sampling bias (see Materials and methods). (c) Example structures in sequencing reads in this colony, displaying a collection of coexisting isoforms with identical base pair content but varying structures with two distinct inverted duplication breakpoints delimiting alignments. (d) Example partially repeated units detected after observing 1.5 periods in the repeat detection pipeline. (e) Interconvertibility of detected partial repeats. Arrow directions indicate the strand to which alignments have been mapped, in addition to the prime notation on length ranks of each structure which indicates an inverted alignment. (f) Crossover events in the background of concatemers that can produce all breakpoint transitions and structures observed in the data.

The detection of the partial repeats in Figure 5d and the overlapping context in Figure 5a led us to the proposed mechanism of generation provided in Figure 5e–f. With this new evidence of partially detected repeat units, we build upon the crossover mechanism first hypothesized in Bos et al., 1980 for what we believe is the same structure we observed. In Figure 5e, we show how the only detected partially repeat units are interconvertible through a single crossover event relying on interactions of oppositely oriented regions. This crossover event in concatemers or repeat units results in the inversion of the non-repeated sequences, and such an event can produce all of the ‘mixed’ read structures we observed across seven samples (Figure 5f). The generative picture given this proposed mechanism is the following: (1) First a recombination event produces one of the repeat units in Figure 5d, by a crossover mechanism like that suggested in Faugeron-Fonty et al., 1983, or origin-dependent mechanisms like those observed in yeast and proposed in Brewer et al., 2011; Brewer et al., 2015. (2) This repeat unit is amplified, forming a concatemer through rolling circle replication, which exists in this form only transiently. (3) High frequency recombination at the region of shared context which was also suggested by Bos et al., 1980 produces nearly uniformly random orientations of alignments in a concatemer form. This proposed mechanism, and the fact that seven colonies derived from the passaging of one spontaneous Petite all had this ‘mixed’ structure, strongly suggests that cells in these colonies are heteroplasmic in these various structures because they are not readily segregated. As such, this structure represents a unique example of coexisting structural isoforms in the mtDNA of baker’s yeast, that are produced through rapid recombination events that counteract the periodic structures produced by rolling-circle replication and any strong selection for particular configurations of mtDNA.

How is the observed structural heterogeneity of mtDNA in yeast colonies partitioned among individual cells?

Given the evidence of heteroplasmy in ‘mixed’ structure colonies, we were curious to understand the nature of the low frequency alternate mtDNA structures we observed in both Grande and Petite colonies. In the bulk sequencing of colonies, low frequency structures can be contributed by both heteroplasmic cells and mixed populations of cells homoplasmic for primary and alternate structures. In the heteroplasmic limit (Figure 6a), the majority of alternate structure content in a colony is contributed by cells in a heteroplasmic state. One consequence of being in this limit is that biological replicates of colonies would be expected to have low variance in total alternate structure content if heteroplasmy persists. In the homoplasmic limit, the major contribution to total alternate structure content comes from cells solely containing alternate structures (Figure 6b). In this limit, stochasticity in the time of generation of the mutational event would be expected to result in high variance in alternate structure content across biological replicates. Furthermore, in circumstances that enable differential selection on homoplasmic lineages, such as in Petite lineages within Grande colonies in non-fermentable conditions, one would expect the alternate structure content to change as a function of growth conditions if there were homoplasmic contributions.

Figure 6

Download asset Open asset

Evidence of heteroplasmy and homoplasmy in Grande and Petite colonies.

(a) A schematic of the two limits of cellular contributions to mtDNA content in a colony. Left: In the heteroplasmic limit, most of the contribution to total alternate structure content comes from cells containing coexisting alternate and primary structures. Right: In the homoplasmic limit, most alternate structure content comes from cells homogeneous in alternate structure content. (b) The total fraction in bp of reads that include any breakpoint not expected from the primary structure in Grande samples in YPD (fermentable carbon source), YPG (non-fermentable), and Petite samples in YPD. Each dot represents the alternate structure content fraction for a single colony, which is the fractional contribution to total mitochondrial content of reads that contain breakpoints that differ from breakpoints in the primary structure. The box plot displays the median value, and the minimum, maximum, first quartile, and third quartile. Blue vectors indicate heteroplasmic/homoplasmic contributions in Grande colonies in YPD. The orange vector indicates heteroplasmic contributions in Grande colonies in YPG. (c) Contributions of petite families/subfamilies to the Petite YPD alternate structure bp fractions in (b) sorted by variance in alternate structure basepair fractions in each subfamily. The arrow indicates that increasing variance is expected to be accompanied by increasing homoplasmic contributions.

In 10 Grande colonies sequenced, 4 were grown in YPG (non-fermentable media) and 6 in YPD (fermentable media). Only one colony in each growth condition harbored high confidence Petite concatemer structures according to our structural detection pipeline, represented by the two distinct breakpoint clusters in the lower half of Figure 2d. One of these high confidence structures within a YPG colony is shown in Appendix 1—figure 5. With a spontaneous Petite frequency of 10% in the genetic background of the strains sequenced (Dimitrov et al., 2009), these low detection rates are due to our conservative approach to detecting breakpoints. In our pipeline, we require at least three breakpoints from three separate reads to form a believable cluster of breakpoint signals (see Materials and methods). In Grande colonies that produce a diverse set of Petite structures afforded by excisions of the intact WT genome, forming high confidence breakpoint clusters, let alone clusters themselves, is unlikely. Therefore, to compute alternate structure frequencies in this analysis, we abandoned the requirement of breakpoint signals to form a cluster. Instead, we simply counted the total base-pair contribution of reads that included any detected breakpoints internal to the primary alignments as long as they were not accompanied by inverted duplication artifacts which are known to introduce spurious breakpoints due to noise in the latter half of the read (Spealman et al., 2020; Appendix 1—figure 5).

The results of this read enumeration approach (Figure 6b) provide two lines of evidence for heteroplasmy and homoplasmy in Grande cells when comparing mtDNA colony composition under fermentable/non-fermentable media. In Grande colonies grown under non-fermentable conditions (YPG), we argue that the presence of Petite (fragmented) mtDNA in bulk sequencing is contributed by heteroplasmic cells or recent heteroplasmy (orange vector in Figure 6b). This is because in contrast to the heteroplasmic cells containing Grande mtDNA, cells homoplasmic for Petite mtDNA cannot replicate under non-fermentable conditions. With this notion, any observed Petite mtDNA in YPG resides either in cells with Grande mtDNA in a heteroplasmic state or is stuck in homoplasmic Petite cells which are unable to replicate but are a product of recent heteroplasmy. The second argument is that the observed increase in Petite mtDNA fraction that accompanies a switch to fermentable media (YPD) in Figure 6b is predominantly due to homoplasmic cells. Again, since under non-fermentable media homoplasmic Petites are suppressed, any increase in Petite mtDNA once we relax respiration requirements should primarily be due to homoplasmic Petite cells. Thus, these results in Grande colonies suggest that both heteroplasmic and homoplasmic cells (sum of blue vectors in Figure 6b) are contributing to alternate structures in fermentable conditions.

In Petite colonies where all cells regardless of mtDNA content are equally fit in fermentable media, we point to two lines of evidence indicating heteroplasmic and homoplasmic contributions to alternate structure frequencies in Figure 6. The following arguments are based on the median value of the alternate structure frequency that is shown in Figure 6b, and the variance of the same frequencies grouped into Petite families/subfamilies in Figure 6c. The first argument we make for evidence of heteroplasmy in Petite colonies is based on the observation that the median alternate structure frequency in Petites is lower than that of Grandes in Figure 6b. We suggest this is due to stronger out-competition of alternate structures by primary structures in Petites than in Grandes. Because small Petite primary structures replicate much faster than Grandes, alternate structures in Petites are less likely to take hold in this competition. Under this assumption, part of the alternate structure signals must be due to cells in at least transiently heteroplasmic states to enable this competition. Using Figure 6c and known properties of mtDNA transmission, we also argue that most variance in alternate structure frequencies within Petite families is due to stochasticity in the generation of homoplasmic lineages. This is because mtDNA transmission bottlenecks and within-cell selection favor the production of homoplasmic clones containing alternate structures (Ling and Shibata, 2004). However, we also highlight that some families have distinguishably lower variance in alternate structure fractions in Figure 6c (e.g., 2 and 4_b), which is at odds with this hypothesis. As such, these low variance alternate structure fractions in select Petite families appear to indicate persistent heteroplasmic contributions. Thus, as in Grande colonies, we are able to tease apart indications of heteroplasmic and homoplasmic contributions to mtDNA diversity in Petite colonies.

What contributes to the fitness of mtDNA structures, and how does structure inform suppressivity?

The role of origins of replication and GC clusters in mtDNA replication

The mechanism of generation of Petite mtDNAs, as well as our explanation of the distribution of excisions observed in spontaneous Petites, relied on the presence of replication origins. Given their importance in conferring a replication advantage to Petite mtDNAs, we were interested to look for mtDNAs without replication origins that we expected would exist at a low frequency. We were also curious to know if mtDNAs devoid of replication origins had any shared structural characteristics that might explain their propagation. Consistent with the notion that repeated structures with high densities of replication origins have a selective replication advantage over wild-type Grande mtDNA (de Zamaroczy et al., 1981; Bernardi, 2005), 32 of the 47 repeat structures detected across the 16 Petite colonies that contain alternate structures exhibit a higher replication origin content fraction than wild-type mtDNA (Figure 7a). Furthermore, all primary structures contain at least a portion of an origin. However, some of the detected alternate structures encircled in red in Figure 7a have no replication origin content at all. These resemble the ‘surrogate’ replication origin structures described in Goursot et al., 1982 and appear to contain GC clusters in similar configurations to replication origins (peaks above 0.6 GC content in Appendix 1—figure 8), which are known to be important for replication and transcription initiation (Baldacci and Bernardi, 1982; de Zamaroczy and Bernardi, 1986). Consistent with this idea, seven of the nine structures without replication origins are enriched in GC clusters compared to the average GC cluster content of wild-type mtDNA (Figure 7b). Besides the suggested involvement of GC clusters in replication, the enrichment in GC clusters here is also consistent with the observation that GC clusters themselves may be preferred over AT-rich regions as excision sites (Faugeron-Fonty et al., 1979; Gaillard et al., 1980). While structures without replication origins are rare in cultured spontaneous Petites (Goursot et al., 1982), high depth long-read sequencing has provided access to these low frequency structures. The ease of identification of these mtDNA structures through long-read sequencing and accompanying structural inference techniques may prove useful in exploring the minimal sequences required for replication in yeast, as well as low frequency genome diversity in other systems.

Figure 7

Download asset Open asset

The role of replication origins and GC clusters in mtDNA replication.

(a) Replication origin content fractions in primary/alternate structures detected in all samples where both are present. Each dot represents the base pair fraction of any of the eight origins of replication in detected structures. Orange dots are the origin fractions in alternate structures, blue in primary structures, and the green line is the origin content fraction in the wild-type mitochondrial genome. Highlighted by a red bubble are nine alternate structures that are devoid of an origin of replication. (b) Black curves (nine total) represent the cumulative distribution of GC content fraction in a sliding window of 10 bp in the highlighted zero-ori alternate samples. The red curve highlights this same GC distribution but in the wild-type (Grande) mitochondrial reference. The gray region indicates GC content fractions in sliding windows that are consistent with GC clusters found in replication origins (Appendix 1—figure 8). The inset shows the fraction of black lines above the red line as a function of GC fraction in the 10 bp window.

How mtDNA structure informs suppressivity

To understand the rules of competition between wild-type and mutant mtDNA, we measured the suppressivity of all Petite colonies within families (see Materials and methods). Suppressivity is a measure of the fraction of Petite progeny in a cross between each Petite sample and a Grande tester strain. Unlike previous work that studied the relationship between structure and suppressivity in highly suppressive Petites with suppressivity upwards of 90% (de Zamaroczy et al., 1981), our strains exhibit suppressivities from the basal rate of the Petite frequency of the Grande strain at 10%, to ~90%, and these suppressivities correlate well with repeat unit lengths of up to 70 kbp (Appendix 1—figure 9). In contrast, the repeat units in previous work were smaller than 10 kbp. While this difference in repeat unit size was due to the intentional selection of small repeat units in the previous work, distributions of deletion sizes and therefore observed suppressivities have been shown to be dependent on numerous nuclear genes (Bradshaw et al., 2017; Ling et al., 2019). To describe how the structure of mtDNA in our samples informs suppressivity, we developed a phenomenological model (Figure 8a) which assumes each repeat unit is independently competing (we discuss alternate models in Appendix 1—figure 10). The key assumptions of the model that explains the data well are: (a) in mating both Grande and Petite cells contribute equal mitochondrial content, $M$ , which is motivated by the observation of equal Grande and Petite contributions observed in MacAlpine et al., 2001, (b) the number of repeat units initially contributed by Grande and Petite cells during mating is given by $M$ over the average repeat unit length ( $L_{G}$ or $L_{P}$ , $G$ for Grande, $P$ for Petite), and (c) Petite and Grande repeat units replicate independently and exponentially with a replication rate linearly dependent on a mtDNA replication speed ( $ν_{G}$ , $ν_{P}$ ) and replication origin density ( $ρ_{G}$ , $ρ_{P}$ ). The suppressivity is then the fraction of Petite repeat units after a certain competition time ( $t$ ) and is given by the ratio of the time evolution of an exponentially growing population of Petite repeat units $N_{P} = (M / L_{P}) e^{ν_{P} ρ_{P} t}$ to total repeat units $N_{P} + N_{G} = (M / L_{P}) e^{ν_{P} ρ_{P} t} + (M / L_{G}) e^{ν_{G} ρ_{G} t}$ given in equation (1):

S u p p r e s s i v i t y = \frac{1}{1 + \frac{N_{G}}{N_{P}}} = \frac{1}{1 + \frac{L_{P}}{L_{G}} e^{(ν_{G} ρ_{G} - ν_{P} ρ_{P}) t}}

Figure 8

Download asset Open asset

A phenomenological model of suppressivity.

(a) A visual depiction of a phenomenological model of suppressivity. Grande and Petite cells are assumed to contribute equal quantities of mtDNA. It is also assumed that each repeat unit replicates independently and exponentially and that during mating the repeat unit input fractions of Grandes and Petites are inversely proportional to repeat unit length. Exponential growth rates are the product of mtDNA replication speeds and origin densities. (b) Suppressivity of all samples compared to a fit of equation (1), which is the black line. The fit parameters are $ν_{g} t^{*} = 10677$ bp, $ν_{p} t^{*} = 2296$ bp, and the coefficient of determination is $R^{2} = 0.85$ . Dots are the average suppressivity across three second passage Petite colonies that share the same first passage progenitor and belong to the families indicated in the legend (same colored dots share a spontaneous Petite colony progenitor). Y-axis error bars are ± the standard deviation in suppressivities across these three second passage Petites colonies. Samples containing inverted breakpoints in their primary structure are those derived from families 2 and 3, the orange and yellow dots, respectively. Family 3 is the mixed structures described in the text.

The data and least-squares fit of this model, with $ν_{G} t^{*}$ and $ν_{P} t^{*}$ as fit parameters, are shown in Figure 8b. These fit parameters are the products of mtDNA replication speeds in Grandes and Petites with the competition time ( $t^{*}$ ) over which competition of Grande/Petite structures occurs following mating. The repeat unit lengths $L_{G}$ and $L_{P}$ in equation (1) are taken to be the sum of unique alignment lengths in each sample, and $N_{G} / N_{P}$ is the second term in the denominator of equation (1), which is the ratio of the Grande to Petite fragment population. To understand the values of the fit parameters and whether or not they are reasonable, we compare them to equivalent parameters inferred from an exponential growth model of a budding Grande cell: First we assume that the mtDNA competition window ( $t^{*}$ ) is equal to the doubling time in a diploid Grande population of cells (90 min). This is a reasonable assumption, as zygotes generally give rise to their first bud within 90 min in our mating experiments and early zygote dynamics dominate suppressivity results. If we also assume that the exponential replication rate of mtDNA in the Grande cell is the product of replication origin density (1 every 10 kbp in Grandes) and replication speed, then the average mtDNA replication speed in Grande cells is 82 bp/min if mtDNA is duplicated over the cell doubling time. This is of the same order as $ν_{G}$ and is within an order of magnitude of $ν_{P}$ in Figure 8b. We note, however, that these mtDNA replication speeds are coarse grained parameters in the model and should not be compared to directly measured DNA fork velocities which require careful consideration of numerous biological parameters (Hawkins et al., 2013).

With respect to the architecture of the model, a variety of alternative models were also tested in Appendix 1—figure 10, revealing that both exponential growth and a repeat unit input fraction inversely proportional to repeat unit length are statistically important inclusions in improving the model in most regimes. The inverse repeat unit length terms seem to suggest that within yeast zygotes, early competition operates in a repeat unit limit where concatemers are reduced to monomeric forms which then undergo replication. Interestingly, active concatemer to monomer partitioning has been observed during mitosis in yeast (Ling and Shibata, 2002; Ling and Shibata, 2004; Ling et al., 2007), although to our knowledge little is known about the structure of mtDNA during mating and zygote formation. Thus, according to this model, the rules of competition between wild-type and mutant mtDNA in yeast depend on the exponential replication of monomeric forms of mtDNA in zygotes, where replication rates are proportional to replication origin densities in repeat units. This highlights the possibility that Petite mtDNAs may have both a replication advantage and segregational advantage if replication occurs in physically separated repeat units in zygotes.

Discussion

In this article, we studied the dynamics of mtDNA fragmentation in yeast through long-read sequencing and quantified Petite mtDNA fitness through mating experiments. The use of long-read sequencing technology, in conjunction with structural inference methods we developed, which to our knowledge have never been applied to Petite mtDNA in yeast, gave us the ability to reconstruct complex mtDNA structures within populations of growing Petite colonies. This experimental approach enabled us to answer some important open questions about Petite mtDNA formation and propagation. On Petite mtDNA generation, we discovered contingency as a driving force behind mtDNA excision dynamics where previous fragmentation sites seed new events. This along with evidence for the generation of non-periodic ‘mixed’ mtDNA structures shows the power of our approach to understand structural variants and their dynamics. On Petite mtDNA propagation, this article reinforced that within cell (intracellular) selection plays a key role in the fragmentation dynamics of Petite mtDNA. A replicative advantage for mtDNA fragments with a high density of replication origins explains why mtDNA excisions tend to cluster near origins and was a critical component of the biophysical model of mtDNA fitness we developed. Both intracellular and cell-level (intercellular) selection also helped explain the distribution of altered mtDNAs among cells in a colony.

Building upon previous work, which alluded to ongoing mtDNA fragmentation in Petites, we provided direct evidence for this fragmentation in Petite colonies and discovered that subsequent mtDNA fragmentation is contingent on previous fragmentation. The presence of various levels of heterogeneity observed within Petite strains indicated by non-primary sub-stoichiometric bands in the restriction digests of Petite mtDNA (Bernardi et al., 1976; Lewin et al., 1978; Lewin et al., 1979; Locker et al., 1979; Marotta et al., 1982) suggested that the excision mechanism was ongoing, continuously producing lower complexity Petite structures in a hypothesized excision ‘cascade’ (Locker et al., 1979; Bernardi, 2005). Here, we have demonstrated unequivocal evidence of secondary excisions that operate on primary structures in sequenced colonies, and unlike the previous work, were quantitative in computing the frequencies of these coexisting structures. We also showed, for the first time, that subsequent excisions are contingent on previous excisions that produced the primary structure in these colonies. This apparent preference for the reuse of existing excision locations constrained the fate of structures formed through subsequent excisions. The reuse of excision sites highlights a tension between contingency and repeatability in the formation of new Petite mtDNA structures. It also seems to suggest that the breakpoints in primary Petite structures are persistent instabilities in mtDNA, perhaps akin to structures like R-loops (Holt, 2019) that may promote strand invasion and recombination at or near these sites. Exploration of the nature of these instabilities in the mtDNA of yeast remains an interesting direction for future studies.

Then we established that previously hypothesized non-periodic ‘mixed’ mtDNA structures are real and indeed non-periodic, which is at odds with the structures of most Petite mtDNAs observed. Hints of non-periodic or non-tandemly duplicated structures (‘mixed’ structures) have been commented on previously in ethidium bromide treated Petites (Locker et al., 1974; Locker et al., 1979) and spontaneous Petites (Heyting et al., 1979; Bos et al., 1980; Faugeron-Fonty et al., 1983). The first proposal of a model for the generation of these structures was provided in Bos et al., 1980, but was subsequently refuted in Faugeron-Fonty et al., 1983,- where the claim was that these were larger ranging periodic structures produced by an unknown mechanism. However, our evidence of partially repeated units in Figure 5d, where two alignments are repeated with the same orientation but separated by an inverted alignment, precludes the structure proposed in Faugeron-Fonty et al., 1983. Thus, this structure is indicative of rapid recombination within mtDNA concatemers that opposes the homogeneity produced by rolling-circle replication and segregation through bud bottlenecks, which results in a collection of coexisting structural isoforms of mtDNA. While coexisting concatemers of various lengths and forms have been observed in yeast mtDNA with the same repeat units (Locker et al., 1979), as well as coexisting isoforms in plant mitochondria (Kozik et al., 2019), the ‘mixed’ structures we have observed are a rare glimpse of this phenomenon in yeast that provide a unique example of persistent intramolecular and intermolecular heterogeneity. These structures are also interesting from the perspective of reverse excision events thought to partition concatemers into monomers during bud formation (Ling and Shibata, 2002). It is unclear how a monomer should be defined in these ‘mixed’ samples, and therefore how it is partitioned, given that circular repeat units are the predominant species in new buds.

Our quantitative analysis, despite using bulk sequencing, allowed us to address an important unaddressed question of how mutant mtDNAs were distributed among cells within colonies. Persistent heterogeneity in Petite colony mtDNA was observed previously as sub-molar restriction digestion fragments (Lewin et al., 1979). These fragments were persistent across biological replicates but seen to disappear and reappear with varying intensities during subcloning. While their persistence indicated heteroplasmy, the varying intensities pointed toward clonal divergence through segregation into homoplasmic clones. Similarly, we argued that most of the variation in mtDNA structure we observed within colonies is likely due to homoplasmic clones in Petites, but with hints of heteroplasmic contributions in a few examples. Our observation of alternate structures in Grande samples under non-fermentable growth conditions is a direct indicator of heteroplasmy which is usually difficult to resolve from coexisting homoplasmic clones in bulk sequencing data. Recently, a variety of single-cell sequencing approaches have been adapted for use in yeast (Jariani et al., 2020; Urbonaite et al., 2021; Dohn et al., 2021) that are poised to enable direct observations of mtDNA heteroplasmy in yeast cells like they have in humans (Maeda et al., 2020). These tools will also provide an opportunity for quantification of mtDNA heteroplasmy (Lareau et al., 2021) which remains a promising direction for future work.

The inferred fine structures of Petite mtDNA from long reads allowed us to develop a phenomenological model for how structure informs fitness measured through the suppressivity of Petite samples. The relationship between suppressivity and mtDNA structure was explored in de Zamaroczy et al., 1981, which provided two general rules: (1) Partial deletions or rearrangements of origins of replication, including inversions of fragments containing origins, reduce suppressivity, and (2), suppressivity is inversely proportional to repeat unit length. This was followed with an observed exception to the second rule (Rayko et al., 1988), which suggested that flanking regions also influenced suppressivity.

In agreement with these rules, we showed that intact replication origins are indeed enriched in both primary and alternate Petite structures compared to wild-type, and that surrogate origins are present in rare alternate structures devoid of canonical origins. We also showed that selection drives mtDNA excision events in Petites to cluster near replication origins. The colocalization of mtDNA excisions and replication origins we observed is consistent with a recent study using short-read sequencing (Osman et al., 2015). The study of Osman et al., 2015 suggested that the colocalization of origins and structural mtDNA variations may be due to replication origins themselves being recombination hotspots, resulting in preferential excisions at these locations. However, we demonstrated through comparisons to excision models that selection for small origin-containing fragments following excisions throughout the entire genome can explain the empirical excision distribution. At the same time, although we cannot see smaller mtDNA variations like SNPs in the present study, rare single-base changes at inferred excision sites in Petite strains have been observed close to origins in de Zamaroczy et al., 1983 and in Osman et al., 2015. This may mean that like the larger structural variations we observe in Petites, small variations may also be observed to be clustered near replication origins just by virtue of these regions being strongly selected for, rather than preferential mutations at these locations. It is also important to note that the mtDNA recombination landscape in wild-type yeast (Fritsch et al., 2014) also varies significantly from the excision distribution we observed. However, we attribute these differences to the selection of origin containing fragments in Petites, as well as differences between homologous recombination (between DNA molecules) and sequence-specific illegitimate recombination (within individual DNA molecules) responsible for excisions.

Consistent with the role of repeat unit length in defining Petite mtDNA fitness, the most predictive model of suppressivity was in limit which assumes that both Grande and Petite mtDNAs are independently replicating in their monomeric repeated units. We note, however, that the observed size distribution of concatemers between Grandes and Petites has been highly variable. Haploid Petite cells have been found to have both larger and smaller average molecular sizes than Grandes depending on the strain, and in both cases, harbor a pool of concatemers of various sizes (Locker et al., 1979). This seems to suggest that in haploid cells, as opposed to zygotes which we consider in the model, mtDNA competition may not be operating in the repeat unit limit. Nevertheless, if the numbers of independent mtDNA concatemers in haploid cells regardless of their size are inversely proportional to repeat unit length, this would be consistent with our model. In fact, in new buds, which all cells start as, concatemer to monomer partitioning has been observed in experiments (Ling and Shibata, 2002; Ling and Shibata, 2004; Ling et al., 2007). So even if during replication each monomer expands to different sized concatemers, the numbers of independent mtDNAs can still be inversely proportional to repeat unit length.

Given the emphasis in our suppressivity model on repeat unit size, but also notable outliers such as family 2 in Figure 8b, it is also possible that effective repeat unit sizes are dictated by the possible secondary structures for a given concatemer. Suppressivities in Petite family 2, which contain tandemly repeated inverted dimers, deviate most from the theoretical curve. Inverted sequences like those in family 2 would also be expected to form the hairpin structure hypothesized in the generative model of the mixed repeats. If this hairpin persists, it will consume directly repeated regions that are preferred in crossover events. The result would be a reduction in the density of repeated regions accessible to excision events, which upon eventual fragmentation would produce a larger effective repeat unit. This may in part explain the lower suppresivity of Petite family 2 from the theoretical value. Overall, more data is likely required to aggregate these apparent exceptions or outliers into a more encompassing model of suppressivity, but the present study provides a foundation of modern techniques and lessons to build upon in this goal.

Considering the diversity and destructiveness of the Petite mutations that this study revealed, it is also worthwhile to comment on the tolerance of the Petite mutation in yeast populations and why their mtDNA might have evolved to produce a structure susceptible to such destruction. Lab strains of S. cerevisiae, like the one investigated in this study, generally have higher Petite frequencies than feral yeast strains due to a collection of nuclear mutations (Dimitrov et al., 2009). However, feral yeast strains of S. cerevisiae still have highly repetitive mtDNA that is susceptible to excision, albeit at a lower frequency. A natural question is then, why did evolution yield such a structure? It has been suggested that the addition of repetitive origins and surrogate origin sequences may have conferred a replicative or transcriptional advantage to the wild-type genome (de Zamaroczy and Bernardi, 1986; Bernardi, 2005). It is also possible that high rates of recombination enabled by this repetition is advantageous for genetic complementation. Another possibility of having a highly recombinant DNA and machinery is for the destruction of invasive foreign DNA. Finally, population-level selection for respiring yeast cells also likely played a central role in opposing the negative effects of this mtDNA instability in populations, helping maintain intact mtDNA in Petite-positive yeast over evolutionary timescales. As such, mtDNA dynamics and the Petite mutation in S. cerevisiae is a wonderful example of how multilevel selection can shape the evolutionary trajectories of genomes.

Finally, we comment on the applicability of the findings in this study to other organisms. We motivated in this article that long-read sequencing and the structural inference methods we developed were able to reconstruct complex coexisting mtDNA structures in yeast colonies. This methodology will also be beneficial for the exploration of other systems that contain complex and repetitive mtDNA structures. A promising area of use is in plants, where complex mtDNA isoforms have been shown to coexist within cells (Kozik et al., 2019).

Interestingly, it turns out that the same components of the process that lead to mtDNA deletions in yeast—recombination followed by excision, selection, and persistent instability—also lead to mtDNA deletions in the human tissues. These mtDNA deletions in humans have been seen to accumulate during aging in skeletal muscle and brain tissue (Fayet et al., 2002; Kraytsberg et al., 2006; Payne and Chinnery, 2015) and are also associated with a variety of diseases including Parkinson’s (Chan, 2006; Bender et al., 2006). Like in yeast, excisions of mtDNA due to recombination between repeated homology have been suggested to be the cause of large mtDNA deletions in humans (Guo et al., 2010). Remarkably, the ‘common deletion’ in humans, which is a ~5 kbp deletion delimited by interacting 13 bp repeats, appears to be most frequent because of selection for replication origins. While numerous repeats exist in human mtDNA, and result in a multitude of deletions associated with disease, the ‘common deletion’ retains both replication origins unlike lower frequency deletions (Samuels et al., 2004). This suggests that the most common mutant mtDNA propagated in human cells is also governed by the same type of intracellular selection for replication origins that drives the Petite mutation. Persistent mtDNA instabilities in human mtDNA, which are suggested to be due to mtDNA content inducing replication fork stalling, have also been observed to create recombination hotspots and colocalize with mtDNA deletions (Kraytsberg et al., 2004; Phillips et al., 2017). This type of instability studied in humans is precisely the type of event that may help explain the contingency in mtDNA fragmentation we observe in yeast.

Materials and methods

Key resources table

Reagent type (species) or resource	Designation	Source or reference	Identifiers
Strain, strain background (Saccharomyces cerevisiae)	W303	GenBank: JRIU00000000.1	MATa/MATα leu2-3,112 trp1-1 can1-100 ura3-1 ade2-1 his3-11,15
Strain, strain background (S. cerevisiae)	yCO362	Boris Shraiman lab at UCSB/GenBank: JRIU00000000.1	MATa W303 leu2-3,112 can1-100 ura3-1 ade2-1 his3-11,15
Strain, strain background (S. cerevisiae)	SY2081	Grant Brown lab at UofT/GenBank: JRIU00000000.1	W303 MATα leu2-3,112 can1-100 ura3-1 ade2-1 his3-11,15 trp1-1
Strain, strain background (S. cerevisiae)	10T3	This study	W303 MATα leu2-3,112 can1-100 ade2-1 his3-11,15 trp1-1
Commercial assay or kit	Qiagen 20/G Genomic-tip	QIAGEN	Cat. no./ID: 10223
Commercial assay or kit	MinION Mk1B with Starter Pack	Oxford Nanopore	Starter Pack(Flow Cell FLO-MIN106 R9.4.1)
Commercial assay or kit	EXP-NBD104 and EXP_NBD114Native barcoding expansion	Oxford Nanopore	EXP-NBD104EXP_NBD114
Commercial assay or kit	SQK-LSK109Ligation sequencing kit	Oxford Nanopore	SQK-LSK109
Commercial assay or kit	AMPureXP purification and cleanup kit	Beckman Coulter	A63881
Software, algorithm	Minimap2	Li, 2018	Minimap2

Parameter	‘Lenient’ parameter set value	‘Selected’ parameter set value	‘Strict’ parameter set value
Minimum alignment length	100 bp	300 bp	1 kbp
Insertion/deletion threshold	30 bp	30 bp	30 bp
DBSCAN epsilon (minimum clustering distance)	100 bp	1 kbp	1.4 kbp
Minimum read support	1	3	5
Majority voting?	No	Yes	Yes

Share this article

Cite this article

Overview of the experiment and observed mtDNA diversity in sequenced yeast colonies.

The mtDNA in yeast exists as concatemers that are delineated by breakpoint signals in sequencing alignments.

Replication origin and mtDNA excision proximity explained by random excision and selection for origin-dense fragments.

mtDNA excision cascades, quantification of colony structural diversity, and contingency in subsequent excisions.

Non-periodic ‘mixed’ mtDNA structures and their mechanism of generation.

Evidence of heteroplasmy and homoplasmy in Grande and Petite colonies.

The role of replication origins and GC clusters in mtDNA replication.

A phenomenological model of suppressivity.

Inverted duplication artifacts are prevalent in Nanopore sequencing of mitochondrial DNA but exhibit patterns that enable their detection.

Separated non-inverted/inverted circular breakpoint plots.

Distribution of minimum distances between alignment edges and origin edges is robust to different structural detection pipeline parameter regimes.

The preference for primary structure excision sites is robust to different filtering parameter regimes.

Structures observed in Grande cells grown in YPG—evidence of heteroplasmy.

Inverted duplication artifact breakpoint edges are further separated than expected due to reduced base calling accuracy.

Alternate structure frequencies when not neglecting reads containing inverted duplication artifacts.

GC clusters in origins and alternate structures without origins.

Supressivity and its correlation with repeat unit size.

Comparison of suppressivity models.

Effect of repeat detection (RD) and majority voting scheme (MV) on breakpoint filtering.

The necessity and effect of recursive Minimap2 alignments in resolving Petite structure.

Structural detection pipeline parameter sweeps—The effect of majority voting, minimum read support, and DBSCAN clustering radius, insertion/deletion size threshold, and minimum alignment length.

Example read length distribution from sequencing of a single Petite strain and its exponential fit.

Structural detection pipeline overview and summary of parameters.

Definitions of parameter regimes referenced in Appendix 1—figure 3 and Appendix 1—figure 4.

Author details

Christopher J Nunn

Contribution

For correspondence

Competing interests

Sidhartha Goyal

Contribution

For correspondence

Competing interests

Citations by DOI

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

Categories and tags

Research organism