Introduction

Cells and entire organisms can tolerate a surprisingly large number of deleterious mutations, especially when they affect gene function partially rather than completely, for example, when only one of two alleles is inactivated. In fact, a typical human germline cell has as many as about one hundred inactive genes, usually complemented by active alleles (MacArthur et al., 2012). Somatic cell lines in healthy tissues can add dozens of new (nearly always heterozygous) mutations every year (Moore et al., 2021). Understanding how this high burden of DNA damage translates into a relatively low mutational load, defined as a reduction in fitness, has proven to be a real challenge (Henn et al., 2015). One possible explanation is that mutations typically have negligible effects, especially when heterozygous. Another is that deleterious mutations are balanced by compensatory mutations, although the latter may not yet have been identified. However, it is also possible that the negative effects of mutations are significantly reduced when they are numerous, i.e. they effectively offset each other through positive epistasis.

Because the fitness architecture of complex organisms is particularly difficult to disentangle, researchers have turned to simpler systems. In the model organism Saccharomyces cerevisiae, collections of engineered single-gene mutations, such as complete deletions, have been created and used to estimate selection and dominance coefficients, as well as the effects of interactions between a few (typically two) loci (Costanzo et al., 2010; Giaever et al., 2002; Steinmetz et al., 2002). It would be much more difficult to combine large numbers of deleterious mutations. This inevitably long process would expose the strains under construction to a high probability of developing fitness-restoring genetic adaptations, which are known to occur even with limited cell proliferation (Szamecz et al., 2014). A rapid way to generate genomes with extensive loads of known mutations is needed. With this in mind, we noted that the budding yeast genome consists of approximately five thousand genes, distributed in a largely random manner across 16 chromosomes. We reasoned that monosomy, the loss of one chromosome in an otherwise regular diploid (2n-1), would mimic the sudden mutation (partial inactivation) of numerous and randomly assembled genes.

Results

Viability of monosomic strains

We began our study by inserting three markers into one chromosome within each homologous pair. These were two URA3 genes, one at the center of each arm, and one copy of kan near the centromere (Tutaj et al., 2022). Since its homolog was unmarked, mutants lacking the markers signaled a possible loss of the marked chromosome (Materials and Methods). We expected that true monosomic strains generated in this way (M1, M2, …, M16) would show visible growth defects, which we indeed observed, except for M1 (Fig. 1a). The loss of markers and slow growth were only hints, and often turned out to be false when DNA sequencing revealed that the expected loss of an entire single chromosome did not occur, or that it did occur but was accompanied by other changes. Fig. 1b shows isolates that passed the sequencing test. As shown in Fig. 1c, small colonies formed by monosomic cells were regularly invaded by much larger ones. For each monosomic, dozens of the latter were re-streaked and found to reproduce the fast-growing phenotype almost universally. To prevent the monosomics from being overgrown by the fast-growers, we had to keep the cultures small, down to 5 µl of YPD medium in the case of M5 or M15. The cultures had to be replicated (in tens or hundreds) so that a sufficient number of those dominated by the slow-growing cells could still be found and replenished.

Diploid yeast strains lacking single chromosomes (monosomics).

(a) Representative colonies of wild-type diploid (WT) and monosomic mutant strains (M1…M16) when grown together for 24 hours at 30 °C on agar-solidified YPD medium. (b) DNA whole genome sequencing coverage. (c) Examples of large colonies emerging among the small monosomic ones.

The procedure described above could not be completed for three chromosomes (VII, XII, and XIII), although we paid special attention to them. Regarding the screens for M7 and M13, we did not find a single mutant that lacked the markers, grew slowly, and could rapidly revert to a normal phenotype. Instead, we regularly encountered isolates of the desired genotype with no growth defect and a diploid genome. Hypothetically, two mitotic crossovers within a pair of homologous chromosomes could produce an offspring cell without the markers. We were able to reject this possibility based on the results of the present study and a previous study (Tutaj et al., 2022). Relevant data and calculations are provided in the Supplementary Text. The best explanation is that the marked chromosomes VII and XIII were lost and their unmarked homologs underwent endoreduplication shortly thereafter. The endoreduplication could result from non-disjunction of a monosomic chromosome. Thus, we suggest that the cells of M7 and M13 were viable but suffered from very slow growth, a high rate of endoreduplication, or both, and were thus especially rapidly overgrown by revertants. The search for M12 yielded unusually numerous and genotypically variable isolates, but none that were correct in terms of growth phenotype or DNA copy number mapping (Materials and Methods, Supplementary Text). Chromosome XII contains the only yeast rDNA region and is a strong recombinational hotspot (Tutaj et al., 2022). Therefore, we suspended attempts to derive M12, especially since, as we show below, its predicted fitness loss from haploinsufficiency was well within the range covered by other chromosomes.

In the wild, monosomic isolates have been found for only a few of the shortest chromosomes (Peter et al., 2018). Similar examples of monosomy have been detected in mutation-accumulation experiments (Sui et al., 2020; Zhu et al., 2014). It is uncertain whether these monosomies were simply tolerated, compensated by other changes, or conditionally adaptive in certain environments. Laboratory experiments have suggested that all monosomies can be induced in yeast, but these were indicated only by loss of heterozygosity at selected loci and may have been transient (Beach et al., 2017; Reid et al., 2008). We suggest that past work on yeast monosomy should be interpreted with caution and that future work should be planned accordingly. Colonies of putative monosomics may actually be diploids or aneuploids. Even if they were initially truly monosomic, they could easily have been overtaken by endoreduplications or other compensations for chromosome loss. This could happen even if a moderately sized colony was formed from an initially monosomic cell. Here, we combined the marker-based approach with genome-wide assays based on sequencing of DNA (reported above) and RNA (below) to confirm that nearly all yeast monosomics are viable.

Epistasis for fitness

Before studying epistasis between multiple mutations, their individual effects must be known. With respect to yeast growth rate, virtually none of the single gene deletions can increase it (Sliwa and Korona, 2005). About 1/5 of them stop growth altogether and another 1/5 slow it down detectably (Giaever et al., 2002). These estimates are based on homozygous or haploid strains under optimal laboratory conditions. Heterozygous deletions, again under good conditions, often show morphological changes, but their growth rate, i.e., proliferative fitness, is rarely and only modestly affected (Deutschbauer et al., 2005; Marek and Korona, 2016; Ohnuki and Ohya, 2018). In Materials and Methods, we explain how we selected 468 heterozygous single deletion strains as potentially (haplo)insufficient for growth on the rich medium, YPD. We performed repeated growth rate measurements for single deletion strains and compared them to a set of carefully selected control strains to make our estimates as accurate and reliable as possible. The doubling rates of individual heterozygous deletions were divided by that of the control to obtain relative doubling rates, rDRs (Datasheet 1). The frequency distribution of rDR is shown in Fig. 2a. It shows that the negative effects of the deletions were undoubtedly present, although predictably small. A substantial fraction of them were in the range of bidirectional effects, most likely composed of phenotypic plasticity and measurement error. (The existence of phenotypic variation not attributable to gene deletions is revealed by the variation within a genetically homogeneous control.)

Contribution of positive epistasis to fitness of monosomics.

(a) Growth performance of heterozygous single-gene deletion strains tested in this study. Left: list of chromosomes with numbers of assayed deletion strains. Right: frequency distribution of rDR (doubling rate related to that of the control). (b) Growth performance of monosomics. Colored bars represent expected performance calculated as a sum of the single-gene effects per chromosome, rDRE=1+∑di, where di=rDRi−1 (deviation from the control). Gray bars show the observed performance of monosomic strains, rDRM. White arrows mark the expected departure from wild-type fitness (expected genetic load), gray the observed one. Black arrows show the extent and direction of epistasis. Three monosomics were not included in these assays (see the main text).

Epistasis is absent when the effect of a gene variant does not depend on the genetic content of other loci. In the case of fitness, a non-neutral mutation corresponds to a proportional change in wild-type fitness (Crow and Kimura, 1971). The absence of epistasis means that the fitness quotient of a given mutation remains unchanged regardless of whether it is the only one present in an individual or accompanied by other fitness-affecting mutations. Accepting this postulate is equivalent to adopting a multiplicative model of fitness structure, in which the fitness of a genotype involving n loci is the product of n respective quotients. The model is additive for log-fitness because such a transformation turns a quotient into a deviation from one and a product of quotients into a sum of such deviations. Since we are working with a (log2) transformed measure of fitness, the deviation caused by an i-th gene deletion is equal to di=rDRi−1. The expected effect of multiple loci is the sum of all deviations involved, rDRE=1+∑di. To predict how each of the 16 monosomic strains should perform, we summed the deviations caused by the individual deletions present on each chromosome (obtaining 16 values of rDRE). We summed both negative and positive values of d to account for non-genetic variation of estimates. We then compared the predicted values of rDRE with the corresponding values of rDRM, that is, estimates obtained experimentally for the actual monosomic strains (Datasheet 1). Fig. 2b shows the expected and obtained values of monosomics’ rDR, where epistasis is equal to the difference between them. The epistasis between multiple deleterious gene deletions turned out to be not only positive but also large, much larger than that observed for just pairs of deletions (Jasnos and Korona, 2007). Most strikingly, some monosomic strains were expected to have a negative doubling rate, i.e. were predicted to be effectively lethal, but turned out to be able to proliferate.
For several monosomics, and especially M4, most of the predicted mutational load was canceled out by epistasis. The contribution of epistasis to fitness seemed so large that it might be difficult to accept without a functional rationalization.
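The calculation of expected performance and epistasis described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, the rDR values are invented placeholders rather than data from Datasheet 1, and the sign convention (observed minus expected, so that better-than-expected growth is positive) is an assumption chosen to match the description of positive epistasis.

```python
def expected_rdr(single_rdrs):
    """Expected relative doubling rate of a monosomic strain.

    Implements rDR_E = 1 + sum(d_i), where d_i = rDR_i - 1 is the
    deviation of the i-th heterozygous deletion from the control.
    Both negative and positive deviations are summed.
    """
    return 1.0 + sum(rdr - 1.0 for rdr in single_rdrs)


def epistasis(observed_rdr, single_rdrs):
    """Epistasis as observed (rDR_M) minus expected (rDR_E).

    Assumed sign convention: a positive value means the strain
    grows better than predicted from the single deletions.
    """
    return observed_rdr - expected_rdr(single_rdrs)


# Hypothetical rDR values for four deletions on one chromosome
# (placeholders for illustration only).
single = [0.95, 0.90, 1.02, 0.85]
rdr_e = expected_rdr(single)      # 1 + (-0.05 - 0.10 + 0.02 - 0.15)
rdr_m = 0.80                      # hypothetical observed monosomic rDR
eps = epistasis(rdr_m, single)

print(f"expected rDR_E = {rdr_e:.2f}")
print(f"observed rDR_M = {rdr_m:.2f}")
print(f"epistasis      = {eps:+.2f}")
```

With enough deleterious deletions on one chromosome, the sum of deviations can drop below −1, which is how an expected doubling rate can become negative even though each single effect is small.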

Transcriptome reaction

Epistasis can be considered in a purely abstract way, as a deviation from additivity or multiplicativity. But it must have a biological explanation. Our attempt to find it began with the isolation and quantitative analysis of mRNA from the ancestral diploid strain and eight monosomic strains. (The selection was random. It nevertheless resulted in the inclusion of strains in which the level of epistasis was low, M1 and M8, or high, M2 and M16.) The first question was obvious: was the decrease in gene dosage compensated by increased expression? Fig. 3a shows that there was no detectable increase in the average intensity of transcription on the monosomic chromosomes. These averages depend on hundreds of genes and would therefore be largely insensitive to increased expression of important but relatively few genes. In this experiment, we knew which genes would be most rewarding to upregulate, those that were most haploinsufficient, and could focus our attention on them. Fig. 3b shows that the fitness effect of a single gene deletion did not correlate with the expression of this gene in a monosomic strain. Thus, neither physical underrepresentation (single copy) nor functional importance (impact on growth) triggered an increase in expression of the genes on monosomic chromosomes: their mRNA level was halved on average, although individual genes could show either up- or downshifts.

Absence of transcriptional compensation in monosomic strains.

(a) Halved RNA production on monosomic chromosomes. For every ORF, the obtained number of RNA-seq reads was divided by the number expected for it if expression were constant over the entire genome. (b) Expression under monosomy vs. single-deletion fitness (rDR). X-axis shows the length of a monosomic chromosome with centromeres marked as circles and gene deletions as bars; colors show the effect of monosomy on the level of a particular mRNA, with each color representing a range of log2 fold change (FC) relative to the control. Y-axis: the difference in rDR between a single gene deletion strain and the control, d=rDR-1. The correlation between fitness effect (rDR-1) and shift in expression (log2FC) is reported as Spearman’s coefficient rs with the associated p-value (not corrected for the multiplicity of comparisons).

Monosomy could have altered the expression of any genes, not just those on the affected chromosomes. We aimed to find any distinct and functionally interpretable patterns of adjustment that might explain how the functioning of monosomic strains was perturbed. We found multiple statistically significant up- and downregulations in the transcriptomes of each monosomic strain analyzed (Datasheet 2). Both parallelisms and incongruencies between them were visible.

Fig. 4a shows the similarities. The translational apparatus of the cytosol was on average significantly and universally upregulated, as evidenced by transcripts encoding proteins that build both large (e.g. RPL28) and small (RPS29B) ribosomal subunits. Cytosolic proteolysis was downregulated (core proteasome component PRE6). In remarkable agreement, chaperones required to fold newly synthesized peptides were upregulated (SSB1/2), while those required to direct destabilized chains to degradation were downregulated (SSA1, HSP82, HSP104). Transcripts encoding mitochondrial proteins were generally less abundant, with a marked decrease in the expression of several genes encoding the electron transport chain (Datasheet 2). In parallel, the expression of the antioxidant machinery was downregulated, especially that of a major ROS scavenger (SOD2).

Parallel and divergent shifts in transcriptomes of monosomic strains.

Heat maps show monosomic mRNA frequencies divided by respective diploid (control) ones. (a) Gene Ontology categories selected to demonstrate similarities in transcriptional profiles of monosomic strains. (b) Regulons demonstrating differences in gene expression between monosomic strains. Expanded versions of all panels can be found in Supplementary Figure S1.

To find transcriptomic differences, we examined groups of genes, each responding to a known signal. This increased the chances of detecting statistical significance and functional divergence. In our experiment, all strains received identical external signals. Therefore, different expression patterns would indicate different internal perturbations. Indeed, as shown in Fig. 4b, substantial variation was detected within several regulons: cAMP-PKA (glucose-activated signaling), NCR (nitrogen catabolism repression), GAAC (general amino acid control), and Msn2up (activation by a broad range of stresses). These differences underscore the significance of the above reported uniformity in the pattern of biosynthetic upregulation and proteolytic downregulation.

All of the above considerations refer to the relative abundance of mRNA species. We also attempted to compare the absolute size of the transcriptome in wild-type BY and three monosomic strains: M1, M2, and M3. We added an admixture of specific external mRNA to provide a quantitative reference, or “spike”, to known total cell volumes of each strain and repeated our assays. The results are shown in Supplementary Figure S3. In summary, the spike represented 4.04% of total mRNA for the BY sample and 6.08, 15.6, and 24.2% for the respective monosomic strains. Thus, monosomy would be associated with a decrease in the absolute level of mRNA. A possible caveat is that the monosomic cells had an altered morphology. They were more rounded in shape and approximately 60% longer along the long axis, so that their individual volumes were several times larger than wild-type. Thus, the size of the transcriptome for M1 and M2 would not be reduced but actually increased if calculated per whole cell (one nucleus). Indeed, one explanation for the observed epistasis for viability could be an ample overproduction of all transcripts, so that even those halved by monosomy are sufficiently abundant. We consider the cell volume calculations more appropriate and therefore accept that the transcripts from monosomic chromosomes were even less abundant than half of the wild-type abundances. The apparent decrease in the cellular density of the transcriptome in the cytoplasm of monosomics is not paradoxical because the RNA content, including ribosomal protein mRNAs, decreases with slowing growth (García-Martínez et al., 2016; Warner, 1999). However, even if one accepts that monosomic transcriptomes were expanded (per nucleus), they would still be unbalanced.
Thus, if the lack of stoichiometry between proteins and the resulting low efficiency of ribosome assembly, rather than the simple scarcity of specific elements, is the major consequence of monosomy, it would remain so regardless of the absolute size of the transcriptome.
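The spike-in percentages reported above can be converted into a rough estimate of mRNA abundance per unit of cell volume relative to the wild type. The sketch below rests on two assumptions that are not stated explicitly in the text: that the same amount of spike was added per unit of total cell volume in every sample, and that each percentage is the spike share of all mRNA reads (cellular plus spike). The function name is hypothetical, and the output is illustrative rather than a result reported by the authors.

```python
# Spike share of total mRNA, as reported in the text.
spike_fraction = {"BY": 0.0404, "M1": 0.0608, "M2": 0.156, "M3": 0.242}


def cellular_mrna_per_spike(fraction):
    """Cellular mRNA per unit of spike.

    Assumes `fraction` = spike / (cellular + spike). If the percentage
    were instead relative to cellular mRNA only, 1 / fraction would be
    the appropriate quantity.
    """
    return (1.0 - fraction) / fraction


# Under the assumption of equal spike per unit of cell volume, this
# ratio is proportional to the mRNA density per unit of volume.
reference = cellular_mrna_per_spike(spike_fraction["BY"])
relative_density = {
    strain: cellular_mrna_per_spike(f) / reference
    for strain, f in spike_fraction.items()
}

for strain, value in relative_density.items():
    print(f"{strain}: {value:.2f} of wild-type mRNA per unit volume")
```

Under these assumptions the estimated density falls progressively from M1 to M3. Multiplying each value by the ratio of mutant to wild-type cell volume would give the corresponding per-cell estimate, which is how a lower density can coexist with a larger transcriptome per nucleus.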

Discussion

The successful derivation and maintenance of monosomic strains demonstrated that it is possible to obtain an a priori designed set of strains with hundreds of partially inactive genes, dozens of which are known to have deleterious fitness effects. It implies that the ability to remain viable despite carrying extensive mutational burdens is not limited to specific combinations of genes, but would likely apply to a wide variety of them. Viability was supported by the epistatic effect, which was of positive value and remarkable strength, but may still be underestimated in our study. It was shown that many of the single deletions with negative fitness effects had undergone compensatory evolution under recurrent replication needed to maintain the strains carrying them (Puddu et al., 2019). Since we used the same collection of deletions, it is likely that some of them were at least partially compensated. All of these effects that may have been missing in the individual deletions were present in the newly derived monosomic strains. This caveat actually strengthens our claim that the positive epistatic component can be substantial and even offset most of the total damage associated with the combined individual negative effects.

The advent of systems biology has raised hopes that the “statistical” and “biological” sides of epistasis can be coherently brought together (Moore and Williams, 2005; Phillips, 2008). Indeed, a truly functional explanation of the measured growth effects would require an understanding of how the material and energetic expenditures were altered by the same mutations when they were separate or combined. Such a metabolomic interpretation could only be attempted semi-quantitatively and for much simpler systems (Molenaar et al., 2009). Regarding proteome and transcriptome, the latter has the advantage of higher accuracy and repeatability, the ability to detect the products of even weakly expressed genes and, crucially for this study, to provide clear signals of major functional transitions such as fast-slow growth, anabolism-catabolism, presence-absence of stress. The relationship between the transcriptome and the proteome is not straightforward in multicellular organisms, where compensation for the perturbations introduced by aneuploidy is often observed (Birchler and Veitia, 2021). In yeast, most of the changes observed in mRNA are generally reflected as corresponding shifts in both the profile of translation and the composition of mature proteins (Larrimore et al., 2020; Pavelka et al., 2010). The relationship is not perfect as, for example, disomics have a somewhat attenuated average expression of proteins encoded on the duplicated chromosome. But even there, the result is still much closer to doubling than to parity (Dephoure et al., 2014). Post-transcriptional attenuation can be more frequent and pronounced when only single genes are doubled, but this is a very different case from ours (Ascencio et al., 2021). Crucially, while signs of compensation at the proteome level are found in wild aneuploids, they are much less pronounced in the laboratory strains and especially in the BY used here (Messner et al., 2023; Muenzner et al., 2024). 
(It is tempting to suggest that the wild aneuploids represent a biased sample of aneuploid mutants, i.e., they either already possessed means of compensating for the distortions introduced by aneuploidy or were able to evolve them and thus avoid purging by selection.) In summary, although mRNA analysis does not provide a complete description of the metabolism of the laboratory-generated monosomic cells, it does provide valuable information about it.

The simultaneous upregulation of genes encoding ribosomal proteins (RP) and downregulation of those encoding subunits of the proteasome was observed in all monosomic strains, providing a crucial insight into the functional response of the cell to monosomy. The absence of a single chromosome meant that several genes encoding the translational machinery that resided on it were halved in dosage, while dozens of others remained unaffected. The cell could not restore the required stoichiometry by overexpressing the affected genes. Favorable environmental conditions signaled that translation needed to be intensified, but functioning ribosomes were in short supply, resulting in seemingly indiscriminate overproduction of ribosomal proteins and withholding of degradation of both them and other cytosolic proteins. The response was inadequate and costly, but it should be seen as an attempt at ad hoc rebalancing rather than a prepared (evolved) response. This interpretation, strongly suggested by the transcriptomic data, appears even more plausible when analyzing the functional diversity of the 468 genes qualified as haploinsufficient. Table 1 shows that the diversity was ample: of the one hundred Slim GO Biological Process categories, dozens could be linked to each chromosome-associated subset of deletions. However, there was a significant common motif. Processes involved in the “protein synthesis apparatus” (our term) were most frequently represented. Their predicted damage was also particularly high, typically equal to or exceeding that required to reduce the growth rate to the levels actually observed in monosomics. The biosynthetic perturbation was thus real, and its relationship to the observed transcriptomic response was likely causal, not merely coincidental.

Yeast Slim GO Biological Process categories of the tested deletions and the predicted and observed relative doubling rate of the monosomic strains.

Some of the other adaptations seen in the monosomic transcriptomes could also be related to the dominant effect of translational inefficiency. A signal to increase biosynthesis would mobilize fermentation and reduce oxidation, which would downregulate genes encoding ROS-scavenging proteins. The Msn2-mediated and other stress responses were apparently absent, contrary to previous reports (Sheltzer et al., 2012). The hallmarks of stress responses include decreased RP production and increased proteolysis (to remove destabilized proteins). Here, these two processes were driven in opposite directions, blurring the expected patterns. The non-homogeneity of the response in the other regulons analyzed (cAMP-PKA, NCR, GAAC) means that functions other than ribosome assembly were severely affected, but differently, depending on the composition of genes made insufficient by the loss of a chromosome (see also Datasheet 2). If there were so many different piecemeal responses, why was uniform positive epistasis observed?

A possible answer begins with the assumption that the cell is an aggregate of multiple functional modules (Hartwell et al., 1999). Our monosomic strains had many different modules that were negatively affected by partially inactivating mutations. These mutations did not interact with each other directly, but rather through the modules to which they belonged. Mutations that affected the most critical module(s) suppressed the negative effect of mutations that affected other, non-limiting modules. The latter were either less damaged or less needed under current conditions. The epiphenomenon of positive epistasis for fitness reflected the fact that not all mutations exerted their negative effects, at least not to the full extent. In our case, this is not just speculation, but rather the simplest and most prudent explanation linking growth rate and transcriptomic data. Our experimental system is special in a way: a microbial cell in which several modules have been compromised but exposed to favorable conditions and thus tested for the single capacity to grow rapidly. Although peculiar, the system is also instructive. Any unicellular organism is typically tested for only a few specific condition-dependent functions defining fitness, which are very different under, e.g., growth or starvation. Similarly, the specialized cells of a complex organism are required to support only a few, and different, elements of life. The modularity of the cell helps to understand how so many partial genetic defects can be carried by so many different organisms without drastically reducing their fitness: rarely or never are all the defects really important.

Interactions within pairs of gene knockouts have received much attention (Costanzo et al., 2020; Segrè et al., 2005; Szappanos et al., 2011). Epistasis for fitness arising from interactions between at least several mutations has also been studied, but mostly in the context of “diminishing returns” of successive beneficial variants (Barrick et al., 2009; Chou et al., 2011; Khan et al., 2011; Kryazhimskiy et al., 2014). It has been proposed that the pattern may arise from interactions of mutations through a “global” phenotypic interface, such as fitness (MacLean et al., 2010; Perfeito et al., 2014; Schoustra et al., 2016). However, a similar overall effect could also be produced by “idiosyncratic” interactions without any mediating factors (Lyons et al., 2020; Reddy and Desai, 2021). Indeed, this has been demonstrated experimentally, at least for a selected and moderately large set of mutations (Bakerlee et al., 2022). Finally, negative epistasis of beneficial mutations has been attributed to cellular modularity by postulating that the positive contribution of each functional module to fitness must have its upper bound (Wei and Zhang, 2019). Given that the negative epistasis between multiple positive effects can arise from different mechanisms, we do not insist that the explanation of positive epistasis between multiple negative effects we propose is the only possible one. However, it may be particularly applicable when the deleterious mutations are truly numerous and distributed across many cellular subsystems that work as functional modules.

Materials and Methods

Single gene deletion strains

1.1. Selection of single gene deletions with a possible effect of haploinsufficiency

Deutschbauer et al. assayed a complete collection of heterozygous single gene deletion strains and identified a total of 184 genes, 98 essential and 86 non-essential, as haploinsufficient for growth in rich medium, YPD (Deutschbauer et al., 2005). We included this set of genes in the present study. Using a different technique, Marek et al. tested all 1,142 essential heterozygous single-gene deletions and 946 non-essential deletions selected as likely not neutral for growth (Marek and Korona, 2016). We reviewed the latter study and, using a false discovery rate of 0.15, accepted up to 404 genes, 256 essential and 148 non-essential, as potentially haploinsufficient. The two sets obtained in two different studies, 184 and 404, overlapped in 112 cases. The overlap was much higher than the 14.3 expected if the two sets were just random samples from among 5,200 yeast genes. On the other hand, it was limited and suggested that new growth assays were desirable. The new assays are described below; they included all the unique strains identified in the two studies minus 5 that were not present in our current strain collections and another 3 that were dropped as superfluous for the orthogonal design of experimental blocks described below. (They were also the 3 least promising based on the earlier assays.) Datasheet 1 lists the strains.
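The figures quoted in this subsection can be checked with simple arithmetic. The short sketch below recomputes the overlap expected by chance and the number of strains entering the new assays, using only numbers given in the text; the variable names are ours.

```python
# Set sizes and overlap as stated in the text.
n_deutschbauer = 184   # haploinsufficient genes, Deutschbauer et al. (2005)
n_marek = 404          # accepted from Marek and Korona (2016) at FDR 0.15
n_overlap = 112        # genes present in both sets
n_genes = 5200         # approximate number of yeast genes

# Overlap expected if both sets were independent random samples:
# each gene is in set A with probability 184/5200 and in set B with
# probability 404/5200, so the expected count is 184 * 404 / 5200.
expected_overlap = n_deutschbauer * n_marek / n_genes

# Unique genes across the two studies, then the strains actually
# assayed after removing 5 unavailable and 3 superfluous ones.
n_unique = n_deutschbauer + n_marek - n_overlap
n_assayed = n_unique - 5 - 3

print(f"expected overlap by chance: {expected_overlap:.1f}")  # 14.3
print(f"unique genes: {n_unique}")                            # 476
print(f"strains assayed: {n_assayed}")                        # 468
```

The observed overlap of 112 is roughly eight times the chance expectation, and the final count agrees with the 468 heterozygous deletion strains referred to in the Results.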

1.2. Control for single deletion strains

To correctly quantify the predictably small negative growth effects introduced by heterozygous single deletion strains, an unbiased and accurate estimate of a wild-type phenotype is required. In the present study, we attempted to achieve this by using multiple strains as controls rather than a single strain. The reason for this was that the gene deletions used here were constructed over several years by several laboratories and then subjected to repeated rounds of propagation, which could lead to genetic divergence. We felt it was risky to use one strain as a control for all the others. We looked for a group of strains that were indistinguishable in terms of growth rate, suggesting that they had not acquired genetic changes that affected this trait. In the case of non-essential genes, we searched the Saccharomyces Genome Database and found 25 ORFs originally labeled “dubious”, which are currently almost certainly spurious ORFs, located between other ORFs and no closer than 100 bp to their START or STOP codons. Heterozygous strains carrying deletions of these genes were tested for doubling rate in a manner specific to this study (described below). After repeated assays, 16 deletion strains that were closest to the median growth performance and not statistically different from each other were selected as controls. Knowing that the non-essential and essential strains differ somewhat in their origin and subsequent handling (Brachmann et al., 1998), we decided to derive a separate set of control strains for the latter. We selected 32 essential genes that were in the very center of a single, strong and narrow modal peak of the frequency distribution of estimates collected for the entire collection of heterozygous essentials (Marek and Korona, 2016). Again, after repeated tests of growth rate, 16 strains that were closest to the median and not different from each other were selected to serve as the final control for essential gene deletions.
Control strains from both groups are listed in Datasheet 1.

Monosomic strains

2.1. Parents of monosomic strains

In an earlier study, we used 32 strains that had a counter-selectable marker (URA3) near the center of each of the 32 chromosome arms and a drug resistance marker (kan) near a centromere (Tutaj et al., 2022). Pairs of strains with the same centromere marker and the counter-selectable markers located on either the left or right arm of the same chromosome were mated with each other. The resulting diploids were sporulated and tetrads dissected to get triple-marked haploid genotypes, URA3-kan-URA3. The latter were mated with a standard BY haploid strain of the opposite mating type. In the final set of 16 diploid strains, each strain had one chromosome triple-marked while its homolog and the remaining 15 chromosome pairs were isogenic with the diploid strain BY4743.

2.2. Derivation and verification of monosomic strains

The diploid strains with triple-marked chromosomes were grown overnight in synthetic complete medium (SC) and then 50 to 500 µl samples of the resulting cultures were plated on standard 5-FOA (5-fluoroorotic acid) plates. Emerging colonies were transferred to new 5-FOA plates and YPD plates with 200 µg/ml geneticin. The goal was to identify variants that were able to grow on the 5-FOA plates but not on the geneticin plates, indicating that all three marker genes may have been lost along with an entire chromosome. Colonies identified in this way were often of different sizes, suggesting that they were genetically heterogeneous. The next criterion was to find variants that produced colonies visibly smaller than those of the parental strain but that occasionally, and reproducibly, gave rise to colonies similar to those of the parental strain. Such strains were grown in replicate small cultures (5 to 100 µl, depending on the monosomic strain), tested for negligible frequency of cells forming large colonies, and collected in larger samples that allowed isolation of DNA in quantities sufficient for next-generation sequencing. Monosomy was considered confirmed when the number of reads was halved along the entire length of a single, expected chromosome. Although simple, this protocol required multiple attempts for some chromosomes because few colonies tended to appear on the 5-FOA plates, most of them remained resistant to geneticin, and those that passed these two criteria did not show the required reversibility, that is, the tendency to occasionally return to normal growth in an apparent single step. Even when all these phenotypic criteria were met, sequencing occasionally revealed genomes other than purely monosomic ones, that is, genomes in which something other than one whole chromosome had been removed.
Nevertheless, once confirmed, the monosomic strains could be reliably propagated on rich and synthetic media, including media containing 5-FOA, as long as the recurrent appearance of fast-growing colonies was monitored and counteracted.
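The read-depth criterion for monosomy can be sketched as follows. This is an illustration only, not the procedure actually used: it works on per-chromosome mean depth (for example, as reported by a coverage tool), so it does not check that the drop extends along the entire chromosome, and the thresholds `low`, `high` and `tol` as well as the function names are assumptions.

```python
from statistics import median

def coverage_ratios(mean_depth):
    """Divide each chromosome's mean read depth by the median over all chromosomes."""
    baseline = median(mean_depth.values())
    return {chrom: depth / baseline for chrom, depth in mean_depth.items()}

def call_monosomy(mean_depth, expected, low=0.4, high=0.6, tol=0.15):
    """True if only the expected chromosome has roughly halved depth.

    low/high bound the 'halved' ratio and tol is the allowed deviation from 1.0
    for the remaining chromosomes; all three are assumed values.
    """
    ratios = coverage_ratios(mean_depth)
    halved = [c for c, r in ratios.items() if low <= r <= high]
    others_normal = all(abs(r - 1.0) <= tol for c, r in ratios.items() if c != expected)
    return halved == [expected] and others_normal

# Hypothetical depths: 16 chromosomes at about 80x, chromosome III at about half of that.
names = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII",
         "IX", "X", "XI", "XII", "XIII", "XIV", "XV", "XVI"]
depth = {name: 80.0 for name in names}
depth["III"] = 41.0
print(call_monosomy(depth, expected="III"))
```

A windowed version of the same ratio would be needed to distinguish whole-chromosome loss from segmental changes.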

2.3. Control for monosomic strains

The monosomic strains were all derived by us and the derivation involved our stock of haploid strains BY4741 and BY4742. These two were then crossed to produce a diploid BY4743 that lacked the URA3MX4 and kanMX4 cassettes and was used as a control in the growth assays of the monosomic strains.

Estimation of DR and rDR

The collection of single-gene deletion strains was arrayed on 6 flat-bottomed 8×12 well titration plates with 150 µl aliquots per well. Within each plate, the first and last wells contained clean YPD medium, columns 3 and 10 contained control strains, and the deletion strains occupied the remainder of the plate. Plates were filled with either essential or non-essential deletions accompanied by the corresponding 16 control strains listed above. One plate contained both essential and non-essential deletions along with 8 essential and 8 non-essential control strains. Plates were inoculated at 1-5% from thawed samples and kept non-agitated at 30°C for 48 hours until they reached approximately similar densities of stationary phase cells. Such conditioned microcultures were used to inoculate plates with fresh YPD at 0.5%, which were maintained at 30°C with shaking at 1,000 rpm. Cultures were analyzed for OD (600 nm) every 0.5 hours using a Tecan Infinite reader. Four independent replicates of measurements were performed, starting with independent conditionings. OD readings were used to calculate DR (doubling rate). To obtain the rDR (relative doubling rate), each DR estimate of an experimental strain was divided by the average DR of the control strains present on the same plate. In the case of the monosomic strains, the entire protocol was analogous except that all experimental and control strains were kept in one plate. Cultures of monosomic strains used in this assay were tested to contain less than 1% fast-growing cells at the time of OD measurements.
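The calculation of DR and rDR can be sketched as below. The text does not specify how DR was derived from the OD readings, so the least-squares slope of log2(OD) within an OD window is an assumption, as are the window limits `od_min` and `od_max`, the function names and the well labels. Only the last step, dividing by the mean DR of the control strains on the same plate, follows the description directly.

```python
import math

def doubling_rate(times_h, od, od_min=0.05, od_max=0.4):
    """Least-squares slope of log2(OD) against time, in doublings per hour.

    Only readings with od_min <= OD <= od_max are used (assumed exponential window).
    """
    pts = [(t, math.log2(y)) for t, y in zip(times_h, od) if od_min <= y <= od_max]
    if len(pts) < 2:
        raise ValueError("too few readings in the exponential window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_l = sum(l for _, l in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in pts)
    return sxy / sxx

def relative_doubling_rates(dr_by_well, control_wells):
    """Divide every DR on a plate by the mean DR of that plate's control wells."""
    control_mean = sum(dr_by_well[w] for w in control_wells) / len(control_wells)
    return {well: dr / control_mean for well, dr in dr_by_well.items()}

# Synthetic example: readings every 0.5 h, one control and one slower strain.
times = [0.5 * i for i in range(13)]
od_control = [0.06 * 2 ** (0.50 * t) for t in times]
od_mutant = [0.06 * 2 ** (0.45 * t) for t in times]
dr = {"A3": doubling_rate(times, od_control), "B1": doubling_rate(times, od_mutant)}
rdr = relative_doubling_rates(dr, control_wells=["A3"])
print(round(rdr["B1"], 3))
```

Normalizing within a plate means that plate-to-plate variation in absolute DR cancels out, provided it affects controls and experimental strains alike.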

Analysis of DNA and RNA

The first step in preparation for both DNA and RNA analysis was to collect samples of monosomic cells that would be nearly free of the rapidly growing revertant cells. Individual monosomic strains were grown as replicate microcultures (5 to 100 µl) in YPD at 30°C to stationary phase. The latter were serially diluted and plated to test for the appearance of large colonies indicating the appearance of compensatory mutations. The microcultures in which more than 99% of the colonies were typical of a particular monosomic strain were pooled. Such tested stationary cultures of monosomic cells were directly used to extract DNA as template for high coverage sequencing (PE 150, expected read depth ~80). The resulting reads were mapped with bowtie2 (Langmead and Salzberg, 2012) to the standard sequences of yeast chromosomes: Ensembl release 100, S. cerevisiae genome R64-1-1. Duplicate reads were marked with MarkDuplicates (Picard Toolkit 2019. Broad Institute, GitHub). Samtools (1.15.1) was used for BAM file sorting, indexing and coverage analysis (Danecek et al., 2021).

For RNA, purity-tested cultures of monosomic cells were transferred to fresh YPD and incubated for 4 hours at 30°C with agitation. Total RNA was then extracted using the RiboPure™ RNA Purification Kit. Three replicates of each monosomic strain and of the control BY4743 strain were prepared in this manner. Library preparation and PE 150 sequencing were performed by Novogene. Approximately 20 million read pairs were generated per sample. Quality control of the reads was performed using FastQC v0.11.9 (Andrews, 2010). RNA reads were aligned to the above standard sequence using HISAT2 v2.1.0 (Kim et al., 2015). The resulting alignment files were sorted and indexed with samtools (1.9). Transcript quantification was performed using cuffquant/cuffnorm v2.2.1 (Trapnell et al., 2012). Gene count data normalization (“TMM” method) and differential expression analysis (exact test) were performed with edgeR (Robinson et al., 2010).

For transcriptome analysis with the addition of the mRNA spike, cultures were conditioned as above. The density of the cultures was estimated by counting the cells in a Bürker chamber. The cells were also photographed under a light microscope, and the long and short axes were measured for about 50 randomly selected cells. Using the formula for ellipsoids and assuming that the two shorter axes are equal, the volumes of individual cells and their sums were calculated. Equal amounts of the mRNA spike (ERCC RNA Spike-In Mix, Thermo Fisher) were added to all cell samples, from which RNA was extracted using hot formamide (Shedlovskiy et al., 2017). Subsequent library preparation, sequencing and analysis were performed as described above.
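The cell volume calculation can be written out as a short sketch. It uses the standard ellipsoid formula V = (4/3)π·a·b·c with semi-axes a, b and c, and sets c = b in line with the assumption that the two shorter axes are equal. The function names and the example measurements are hypothetical; the measured long and short axes are taken to be full axis lengths.

```python
import math

def cell_volume(long_axis, short_axis):
    """Volume of an ellipsoid whose two shorter axes are equal.

    Arguments are full axis lengths (e.g. in µm); they are halved to obtain semi-axes.
    """
    a = long_axis / 2.0
    b = short_axis / 2.0
    return 4.0 / 3.0 * math.pi * a * b * b

def total_volume(cells):
    """Sum of volumes for an iterable of (long_axis, short_axis) pairs."""
    return sum(cell_volume(long_axis, short_axis) for long_axis, short_axis in cells)

# Hypothetical measurements of three cells (long axis, short axis), in µm.
measured = [(5.0, 4.0), (6.0, 4.5), (4.5, 4.5)]
print(total_volume(measured))
```

When both axes are equal the formula reduces to the volume of a sphere, which gives a convenient sanity check.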

Data availability

All data generated or analyzed during this study are included in this published article (and its supplementary information files) or are deposited in the BioProject and GEO (Gene Expression Omnibus) public databases. Accession numbers can be provided upon request.

Acknowledgements

We thank J. Bobula for her experimental assistance.

Additional information

Funding

National Science Centre of Poland grant NCN 2017/25/B/NZ2/01036 (RK)

National Science Centre of Poland grant NCN 2014/13/B/NZ8/04668 (KT)

Jagiellonian University grant DS/MND/WB/INoS/10/2018 (HT)

The open-access publication of this article was funded by the programme “Excellence Initiative Research University” at the Faculty of Biology of the Jagiellonian University in Kraków, Poland.

Author contributions

Conceptualization: HT, RK

Experimental design: HT, KT, RK

Experiments: HT, AP, MM, RK

Analysis: HT, KT, RK

Visualization: HT

Writing – original draft: RK

Writing – review & editing: HT, RK

Competing interests

The authors declare no competing interests.

Supplementary Materials

Supplementary Text 1 - Expected fitness effect of multiple mutations

Fitness is the number of offspring divided by the number of progenitors, w=No/Np. This can be the number of cells left by one cell (including itself in the case of budding cells) over a unit of time. Assume that an organism carries multiple mutations (α, β, … ω) located in heterozygous loci, with their wild-type counterparts universally marked with +. The fitness of a strain carrying a single heterozygous mutation is wα/+, and so on. Fitness can be converted to relative fitness, i.e., expressed as a quotient of the wild-type fitness, wα/+/w+/+, and so on. Under the multiplicative model of mutation accumulation, the expected joint effect of multiple mutations on relative fitness is the product of the individual quotients:

When a population is growing continuously, a log transformation of fitness is typically applied because it equals the rate of growth. In particular, it could be the number of doublings completed in a unit of time:

After replacing the log multiplicative formula above with its log additive equivalent, all terms of the latter can be normalized by dividing by log2 fitness of the wild-type, which transforms them into relative doubling rates, for example, rDRα/+=log2(wα/+)/log2(w+/+). The combined effect of multiple mutations is then equal to

or

where d = rDR-1 is an individual mutation effect on the relative doubling rate (see Fig. 2B in the main text).
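The displayed equations of this derivation can be reconstructed from the definitions given in the text. The LaTeX below is such a reconstruction rather than the original typesetting; in particular, the subscript αβ…ω used for the genotype carrying all mutations is an assumed notation. The third line is the intermediate log-additive step, and dividing it by log2(w+/+) yields the last two lines.

```latex
\begin{align}
\frac{w_{\alpha\beta\ldots\omega}}{w_{+/+}}
  &= \frac{w_{\alpha/+}}{w_{+/+}} \cdot \frac{w_{\beta/+}}{w_{+/+}} \cdots \frac{w_{\omega/+}}{w_{+/+}} \\
DR &= \log_2 w \\
\log_2 w_{\alpha\beta\ldots\omega} - \log_2 w_{+/+}
  &= \left(\log_2 w_{\alpha/+} - \log_2 w_{+/+}\right) + \ldots + \left(\log_2 w_{\omega/+} - \log_2 w_{+/+}\right) \\
rDR_{\alpha\beta\ldots\omega} - 1
  &= \left(rDR_{\alpha/+} - 1\right) + \left(rDR_{\beta/+} - 1\right) + \ldots + \left(rDR_{\omega/+} - 1\right) \\
d_{\alpha\beta\ldots\omega}
  &= d_{\alpha/+} + d_{\beta/+} + \ldots + d_{\omega/+}
\end{align}
```

Under this model, the expected deviation of a multiply mutated strain from the wild-type relative doubling rate is simply the sum of the individual deviations.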

Supplementary Text 2 - Transient monosomy of chromosomes VII and XIII

The procedure for isolating monosomics described in Materials and Methods was applied in the same way to all strains with marked chromosomes, including both VII and XIII. For both strains, we tended to obtain Ura- colonies (on 5-FOA agars) that were mostly large (wild-type size), with small ones accounting for a few percent of all isolates. In the search for potential monosomics, we first concentrated on the small colonies. Out of as many as 100 isolates for each parental strain, not a single one met our criteria. Most were kanR, and none of the rest produced small and largely uniform colonies that would tend to revert to wild-type growth. Despite these discouraging signs, two of the most promising candidates were sequenced for each chromosome. These analyses revealed complex polyploidies rather than regular monosomy.

We then examined the dominant large Ura- colonies. They appeared at a mean frequency of 3.1E-06 and 6.3E-06 for chromosomes VII and XIII, respectively, calculated as the number of colonies divided by the number of cells plated on 5-FOA. For each parental strain, 108 colonies collected from several separately inoculated selective agars were randomly selected for further testing. All of these isolates grew rapidly when reseeded. Only 2 of the isolates from the strain carrying the marked chromosome VII proved to be kanR, and none in the case of chromosome XIII. To interpret these results, it may be useful to recall that the URA3 marker was placed in the middle of both the left and right chromosomal arms, while kan occupied one of the ORFs closest to the centromere, giving the arrangement URA3-kan-URA3.

The homolog of the marked chromosome was free of all three markers. In principle, two mitotic crossovers between telomere and URA3 on each arm could produce a progeny cell with two marker-free chromosomes. It would be a regular diploid and thus have a wild-type growth rate. There are two arguments against this proposal. First, double crossovers between URA3 and kan would be similarly frequent, and therefore cells lacking URA3 but retaining kan would not be as rare as observed here. Second, in our previous study (Tutaj et al., 2022), we measured the frequency of crossover on the two chromosomes. Based on our estimates, two simultaneous crossovers located within the distal halves of the arms of chromosomes VII and XIII would occur at a rate of 1.1E-08 and 7.9E-09, respectively, which is more than two orders of magnitude lower than the observed frequency of large Ura- colonies (3.1E-06 and 6.3E-06).
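The size of the gap between observed and expected frequencies can be checked with a few lines of arithmetic. The values are those quoted in the text for chromosomes VII and XIII; the function name `fold_excess` is a hypothetical helper introduced for illustration.

```python
import math

# Frequencies quoted in the text.
observed = {"VII": 3.1e-06, "XIII": 6.3e-06}   # large Ura- colonies per plated cell
expected = {"VII": 1.1e-08, "XIII": 7.9e-09}   # estimated rate of double crossovers

def fold_excess(obs, exp):
    """Return {chromosome: (observed/expected ratio, log10 of that ratio)}."""
    return {c: (obs[c] / exp[c], math.log10(obs[c] / exp[c])) for c in obs}

excess = fold_excess(observed, expected)
for chrom, (ratio, log_ratio) in excess.items():
    print(chrom, round(ratio), round(log_ratio, 2))
```

Both ratios fall between two and three orders of magnitude, roughly 280-fold for chromosome VII and 800-fold for chromosome XIII.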

The fast-growing and marker-free isolates were then subjected to sporulation and tetrad dissection. For each chromosome, 8 isolates were randomly selected and at least 10 tetrads were dissected. In each of the 16 strains analyzed, tetrads with four viable haploids formed a clear majority, as expected for the products of meiosis in a regular diploid cell. Two of the 8 strains analyzed in this way were further tested by high coverage DNA sequencing. No sequences of the two markers or the MX4 cassette containing them were found. The three loci in question contained only the expected wild-type sequences with a coverage typical for the rest of the genome, i.e. diploid (Supplementary Figure S2).

In summary, we did not find a single slow-growing monosomic strain for chromosomes VII and XIII, despite our intensive search. The regularly occurring fast-growing isolates with the marker phenotype we were looking for were most unlikely to be the result of recombination events such as a double crossover. They were normal diploids, with the chromosomes in question carrying the phenotypic alleles and DNA sequences characteristic of the unmarked homologs. These results suggest that the marked chromosomes were lost and their unmarked homologs underwent endoreduplication.

The unsuitability of chromosome XII for monosomy research

The parental strain with the triple-marked chromosome XII tended to produce unusually numerous colonies on the selective 5-FOA agars, mostly as large as those of the regular diploid strains. We did not analyze these large colonies because we considered them likely to be produced by recombination and thus unrelated to monosomy. This chromosome contains the only yeast rDNA region, which is about 1 Mb long and is known to be a strong recombination hotspot. Therefore, the quantitative arguments for excluding double crossovers developed for chromosomes VII and XIII were not applicable here. Small colonies were also more numerous than in other strains. The marker and growth rate reversibility tests were applied to several hundred of them, but failed in almost all cases. A few of the most promising isolates were sequenced and none of them showed the regular monosomy we were looking for. We decided to stop our search for M12 at this point because the presence of rDNA made it impractical.

Parallel and divergent shifts in transcriptomes of monosomic strains.

Heat maps show monosomic mRNA frequencies divided by respective diploid (control) ones. (a) Gene Ontology categories selected to demonstrate similarities in transcriptional profiles of monosomic strains. (b) Regulons demonstrating differences in gene expression between monosomic strains. (This figure is an expanded version of Fig. 4 in the main text.)

DNA whole genome sequencing coverage after the postulated endoreduplication.

Two isolates descending from the parental diploid strains with marked chromosomes VII or XIII are shown. They were subjected to sequencing after being found to lack phenotypic markers and produce four viable spores.

Correlation of counts of individual mRNAs between wild-type and monosomic strains.

Counts are expressed as fractions of either wild-type BY or monosomic M1, M2 and M3 total transcriptomes. Gray circles represent mRNAs from the unaffected 15 chromosomes and group around the diagonal. Blue circles represent spike mRNAs. Red circles represent mRNAs from the monosomic chromosomes (I, II or III in the respective graphs). Note that the counts of mRNAs from the monosomic chromosomes are, as expected, underrepresented in the respective monosomic strains (red circles are below the diagonal). Spike counts are higher in the monosomic strains than in BY (blue circles are above the diagonal). As reported in the main text, the total fraction of spike counts in BY is 4.04%. Analogous sums for M1, M2 and M3 are 6.08, 15.6 and 24.2%. This can be seen here as an increasing distance between the gray and blue circles.