Introduction

A typical human germline cell has as many as one hundred inactive genes, usually complemented by active alleles (MacArthur et al., 2012). Somatic cell lines in healthy tissues add dozens of new (nearly always heterozygous) mutations every year (Moore et al., 2021). Understanding how this burden of DNA damage translates into the mutational load, defined as a decrease in fitness, has proved demanding, not only in humans but also in simpler organisms. Estimates of selection and dominance coefficients of single mutations tend to be imprecise, and any quantification of interactions between multiple mutations (epistasis) appears even more challenging (Henn et al., 2015). The difficulty with gene interactions is not just technical but primarily methodological. Any comprehensive research on epistasis would require that the studied combinations of mutations represent an unbiased sample of all possible combinations. The actual combinations encountered in extant organisms (cells) do not satisfy this requirement because they represent samples distorted by natural selection. For example, the finding that individuals whose fitness is negatively impacted by epistasis are underrepresented or even absent cannot be unequivocally interpreted: either the negative interactions are rare to begin with, or organisms affected by them are eliminated before being scored. Similarly, the presence of multiple and undoubtedly deleterious mutations in an otherwise well-performing organism may mean either that their negative effects tend to alleviate each other or that some yet-unrecognized genetic compensations have developed.

As the fitness architecture of naturally occurring organisms is difficult to disentangle, researchers have turned to simpler systems. In the model organism Saccharomyces cerevisiae, collections of engineered single-gene mutations, such as complete deletions, have been created and used to estimate the selection and dominance coefficients as well as the effects of interaction between a few (typically two) loci (Costanzo et al., 2010; Giaever et al., 2002; Steinmetz et al., 2002). It would be much more difficult to combine large numbers of harmful mutations. This unavoidably long process would expose the strains under construction to a high probability of developing fitness-restoring genetic adjustments, which are known to arise even under limited propagation (Szamecz et al., 2014). A rapid way to generate genomes with extensive loads of known mutations is needed. With this in mind, we noted that the budding yeast genome is composed of roughly five thousand genes that are split among 16 chromosomes in a mostly random fashion. We reasoned that monosomy, the loss of one chromosome in an otherwise regular diploid (2n-1), would mean a sudden mutation (partial inactivation) of numerous and arbitrarily assembled genes.

Results

Viability of monosomic strains

We began our study by inserting three markers into one chromosome within each homologous pair: two URA3 genes, one at the center of each arm, and one copy of kan close to the centromere (Tutaj et al., 2022). As its homolog was unmarked, mutants lacking the markers signaled a possible loss of the marked chromosome (Materials and Methods). We expected that genuine monosomic strains generated in this way (M1, M2, …, M16) would exhibit visible growth defects, which we indeed observed, except for M1 (Fig. 1a). The loss of markers and slow growth were only hints and proved misleading on several occasions, when DNA sequencing revealed that the expected loss of a single chromosome had not happened, or had happened but was accompanied by other changes. Fig. 1b presents isolates that passed the sequencing test. As Fig. 1c shows, small colonies formed by monosomic cells were regularly infested by much bigger ones. For every monosomic, at least dozens of the latter were re-streaked and found to reproduce the fast-growth phenotype nearly universally. To prevent them from overgrowing the monosomics, we had to keep the cultures small, down to 5 µl of YPD medium in the case of M5 or M15. The cultures had to be highly replicated so that a sufficient number of those dominated by the slow-growing cells could still be found and recurrently replenished.

Diploid yeast strains lacking single chromosomes (monosomics).

(a) Representative colonies of wild-type diploid (WT) and monosomic mutant strains (M1…M16) when grown together for 24 hours at 30 °C on agar-solidified YPD medium. (b) DNA whole genome sequencing coverage. (c) Examples of large colonies emerging among the small monosomic ones.

The procedure described above could not be completed for three chromosomes (VII, XII, and XIII), even though we paid particular attention to them. In screens for M7 and M13, we did not find a single mutant that lacked the markers, grew slowly, and could rapidly revert to a normal phenotype. Instead, we regularly encountered isolates of the sought genotype without a growth defect and with a diploid genome. Hypothetically, two mitotic cross-overs within a pair of homologous chromosomes could produce one progeny cell without the markers. We were able to reject this possibility based on the results of the present study and a former one (Tutaj et al., 2022). Relevant data and calculations are provided in Supplementary Text. The only viable explanation is that the marked chromosomes VII and XIII had been lost and their non-marked homologs underwent endoreduplication shortly after that. Thus, M7 and M13 were probably not inviable but prohibitively unstable. (They could suffer from very slow growth, a high rate of endoreduplication, or both.) Searching for M12, we obtained unusually numerous and genotypically variable isolates, but none was correct in terms of growth phenotype or DNA copy number mapping (Materials and Methods, Supplementary Text). Chromosome XII contains the only yeast rDNA region and is a strong recombinational hotspot (Tutaj et al., 2022). Therefore, we suspended attempts to derive M12, especially since, as we show below, its predicted burden of haploinsufficiency was well within the range covered by other chromosomes.

We sought to include all monosomics in our assays in order to test whether monosomy can occur and persist in the absence of other genetic changes. In the wild, monosomic isolates have been found only for a few of the shortest chromosomes (Peter et al., 2018). A similar scope of monosomy has been detected in mutation-accumulation experiments (Sui et al., 2020; Zhu et al., 2014). It is unclear whether these monosomies were simply tolerated, compensated by other changes, or conditionally adaptive, especially in natural environments. There have been laboratory experiments signaling that all monosomies can be induced in yeast, but this likely happened only transiently, as these studies relied on screening for the loss of heterozygosity at selected loci only (Beach et al., 2017; Reid et al., 2008). Thus, former work on yeast monosomy should be interpreted cautiously and future work planned accordingly. Colonies of presumed monosomics can actually be diploids or complex aneuploids. Those initially truly monosomic are readily overtaken by endoreduplications or other compensations of the chromosome loss (even within a moderately sized colony). Here we combined the marker-based approach with genome-wide tests based on sequencing DNA (reported above) and RNA (below) to confirm that pure, genetically non-compensated yeast monosomics are viable.

Epistasis for fitness

Before studying epistasis among multiple mutations, their single effects should be known sufficiently well. Regarding the yeast growth rate, effectively no single-gene deletion can make it faster (Sliwa and Korona, 2005). About 1/5 of them stop growth altogether and another 1/5 decelerate it detectably (Giaever et al., 2002). These estimates refer to homozygous or haploid strains under optimal laboratory conditions. Heterozygous deletions, again under good conditions, exhibit morphological alterations quite often, but their growth rate, i.e., proliferative fitness, is affected rarely and only modestly (Deutschbauer et al., 2005; Marek and Korona, 2016; Ohnuki and Ohya, 2018). In Materials and Methods, we explain how we selected 468 heterozygous single deletion strains as those likely (haplo)insufficient for growth on rich medium, YPD. We applied replicated growth rate measurements for individual deletion strains and compared them to a set of carefully selected control strains in order to make our estimates as precise and reliable as possible. Doubling rates of individual heterozygous deletions were divided by that of the control to yield relative doubling rates, rDRs (Datasheet 1). The frequency distribution of rDR is shown in Fig. 2a. It demonstrates that the negative effects of deletions were undoubtedly present although predictably small; a sizable fraction of them was on the order of the bidirectional effects of phenotypic plasticity and/or measurement errors (as exposed by the variation within the genetically homogeneous control).

Contribution of positive epistasis to fitness of monosomics.

(a) Growth performance of heterozygous single-gene deletion strains tested in this study. Left: list of chromosomes with numbers of assayed deletion strains. Right: frequency distribution of rDR (doubling rate related to that of the control). (b) Growth performance of monosomics. Colored bars represent expected performance calculated as a sum of the single-gene effects per chromosome, rDRE=1+∑di, where di=rDRi−1 (deviation from the control). Gray bars show the observed performance of monosomic strains, rDRM. White arrows mark the expected departure from wild-type fitness (expected genetic load), gray the observed one. Black arrows show the extent and direction of epistasis. Three monosomics were not included in these assays (see the main text).

Epistasis is absent when the effect of a gene variant does not depend on the genetic content of other loci. In the case of fitness, a non-neutral mutation corresponds to a proportional change of wild-type fitness (Crow and Kimura, 1971). An absence of epistasis means that the fitness quotient of a particular mutation remains unchanged irrespective of whether it is the only one present in an individual or is accompanied by other fitness-affecting mutations. Accepting this postulate is equivalent to adopting a multiplicative model of fitness structure, under which the fitness of a genotype involving n loci is the product of the n respective quotients. The model is additive for log-fitness because such a transformation turns a quotient into a deviation from one and a product of quotients into a sum of such deviations. Since we operate with a log2-transformed measure of fitness, the deviation caused by the i-th gene deletion is equal to di=rDRi−1. The expected effect of multiple loci is one plus the sum of all deviations involved, rDRE=1+∑di. To predict how each of the 16 monosomic strains should perform, we summed the deviations caused by the individual deletions present on each chromosome (acquiring 16 values of rDRE). We summed both negative and positive values of d to account for the non-genetic variation of estimates mentioned above. We then compared the predicted values of rDRE with the respective values of rDRM, that is, estimates obtained experimentally for the actual monosomic strains (Datasheet 1). Fig. 2b shows the expected and obtained values of rDR; epistasis is equal to the difference between them. Epistasis between multiple deleterious gene deletions turned out to be not only positive but also large, much larger than that observed for just pairs of deletions (Jasnos and Korona, 2007). Most strikingly, some monosomic strains were expected to have a negative doubling rate, i.e., were predicted to be effectively lethal, but turned out to be able to proliferate. For several monosomics, and especially M4, most of the predicted mutational load was canceled by epistasis. The contribution of epistasis to fitness appeared so high that it might be difficult to accept without functional rationalization.
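The additive expectation and the resulting epistasis estimate can be illustrated with a minimal Python sketch. It assumes that epistasis is taken as the observed value minus the expected one (rDRM minus rDRE), so that better-than-expected growth gives a positive number; the function names and the toy rDR values are hypothetical and are not data from this study.

```python
from typing import List


def expected_rdr(single_rdrs: List[float]) -> float:
    """Expected relative doubling rate, rDR_E = 1 + sum(d_i), with d_i = rDR_i - 1."""
    return 1.0 + sum(rdr - 1.0 for rdr in single_rdrs)


def epistasis(observed_rdr: float, single_rdrs: List[float]) -> float:
    """Observed minus expected rDR (sign convention assumed for this sketch)."""
    return observed_rdr - expected_rdr(single_rdrs)


# Hypothetical single-deletion rDR values for the genes of one chromosome.
# Values above 1 are kept, mirroring the summation of both negative and positive d.
toy_single_rdrs = [0.95, 0.90, 1.02, 0.80]
toy_observed_rdr = 0.85  # hypothetical rDR_M of the matching monosomic strain

toy_expected = expected_rdr(toy_single_rdrs)
toy_epistasis = epistasis(toy_observed_rdr, toy_single_rdrs)
print(f"expected rDR: {toy_expected:.2f}, epistasis: {toy_epistasis:.2f}")
```

With many deletions per chromosome the sum of deviations can exceed one in absolute value, which is how an expected rDR below zero can arise under this model.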

Shifts in transcriptome profiles

Although epistasis can be viewed in a purely abstract way, as a deviation from additivity or multiplicativity, it must always have a biological explanation. Our attempt to find it started with the isolation and quantitative examination of mRNA from the ancestral diploid strain and eight randomly chosen monosomic strains. The first question was obvious: was the deficiency in dose compensated by elevated expression? Fig. 3a shows that there was no detectable increase in the intensity of transcription on the monosomic chromosomes. A linear relation between the number of copies of a chromosome and the number of produced transcripts has been reported for yeast, though most of the available data pertain to aneuploids with extra rather than missing chromosomes (Larrimore et al., 2020; Torres et al., 2007). Hence, neither our study nor earlier ones revealed compensation at the level of chromosome-scale averages. Those averages depend on hundreds of genes, mostly dose-insensitive, and would therefore be insensitive to increased expression of the genes that were important but relatively few. In this experiment, we knew for which genes upregulation would be most rewarding and could focus our attention on them. Fig. 3b shows that they were not transcribed at elevated rates. Thus, neither physical underrepresentation (single copy) nor functional importance (impact on growth) triggered transcriptomic compensation.

Absence of transcriptional compensation in monosomic strains.

(a) Halved RNA production on monosomic chromosomes. For every ORF, the obtained number of RNA-seq reads was divided by the number expected for it under expression being constant over an entire genome. (b) Expression under monosomy vs. single-deletion fitness (rDR). X-axis: the effect of monosomy on a particular mRNA level was calculated as log2 fold change relative to the diploid control and displayed as a color-coded bar at its chromosomal position (the width of a box reflects chromosome length). The shifts in expression at the monosomic chromosomes were calculated after taking into account that the gene dose was halved there. Y-axis: the difference in rDR between a single gene deletion strain and the control, d=rDR-1. Pairs of estimates were tested for rank correlation; shown are Spearman’s coefficients and p-values (not corrected for the multiplicity of comparisons).
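The comparison summarized in panel (b) can be sketched in Python as follows. The sketch makes two assumptions that the text does not spell out: that accounting for the halved gene dose means adding one log2 unit to the fold change of genes on the monosomic chromosome, and that ties are handled with average ranks. All function names and the toy numbers are hypothetical, and p-values are not computed here.

```python
import math
from typing import List, Sequence


def dose_adjusted_log2fc(mono_level: float, diploid_level: float,
                         on_monosomic_chromosome: bool) -> float:
    """log2 fold change vs. the diploid; +1 if the gene dose was halved (assumed correction)."""
    correction = 1.0 if on_monosomic_chromosome else 0.0
    return math.log2(mono_level / diploid_level) + correction


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example: expression shifts (x) against fitness deviations d = rDR - 1 (y).
toy_log2fc = [dose_adjusted_log2fc(m, 100.0, True) for m in (48.0, 55.0, 50.0, 62.0)]
toy_d = [-0.10, -0.02, -0.05, -0.01]
print(f"Spearman rho = {spearman(toy_log2fc, toy_d):.2f}")
```

A transcript present at exactly half the diploid level on the monosomic chromosome yields an adjusted value of zero, that is, no compensation and no further repression.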

The availability of entire transcriptomes enables examination of the cell functioning far beyond testing for compensatory expression at the affected loci. We aimed at finding any distinct and functionally interpretable patterns of adjustments that could explain why the monosomics performed better than expected. We found multiple statistically significant up- and downregulations in transcriptomes of every analyzed monosomic strain (Datasheet 2). Both parallelisms and incongruencies between them were visible. Fig. 4a shows the similarities.

Parallel and divergent shifts in transcriptomes of monosomic strains.

Heat maps show monosomic mRNA frequencies divided by respective diploid (control) ones. (a) Gene Ontology categories selected to demonstrate similarities in transcriptional profiles of monosomic strains. (b) Regulons demonstrating differences in gene expression between monosomic strains. Expanded versions of all panels can be found in Supplementary Figure S1.

The translational apparatus of the cytosol was on average considerably and universally upregulated, as evidenced by transcripts coding for proteins building both large (e.g. RPL28) and small (RPS29B) ribosomal subunits. Cytosolic proteolysis was downregulated (core proteasome constituent PRE6). In a remarkable accord, chaperones needed to fold newly synthesized peptides were upregulated (SSB1/2) while those required to direct destabilized chains to degradation were downregulated (SSA1, HSP82, HSP104). Transcripts coding for mitochondrial proteins were generally less abundant, with a marked decline in the expression of several genes coding for the electron transport chain (Datasheet 2). At the same time, the expression of antioxidative machinery was dampened, especially that of a chief ROS scavenger (SOD2).

To find transcriptomic differences, we examined groups of genes each reacting to a known signal. This raised chances for detecting statistical significance and functional divergence. In our experiment, all strains received identical outer signals. Therefore, different expression patterns would indicate different internal distortions. Indeed, as Fig. 4b demonstrates, substantial variation was detected within several regulons: cAMP-PKA (glucose-activated signaling), NCR (nitrogen catabolism repression), GAAC (general amino acids control), and Msn2up (activation upon a broad array of stress factors).

Discussion

The successful derivation of monosomic strains demonstrated that it is possible to obtain an a priori designed set of strains with hundreds of partially inactive genes, tens of which exert deleterious fitness effects. It implies that the ability to remain viable despite carrying extensive mutational burdens is not restricted to assorted combinations of affected genes but would likely hold for a broad variety of them. The viability was supported by an epistatic effect of positive value and remarkable strength. The contribution of epistasis was probably still underestimated. It has recently been shown that many of the single deletions with negative fitness effects had undergone compensatory evolution under recurrent propagation of the strains carrying them (Puddu et al., 2019). Therefore, some of those deletions were likely missing among our strains or were partially compensated, which made their joint predicted harm lower than it would be in an unaffected collection. All those potentially missing heterozygous effects were present in the freshly derived monosomic strains, biasing downward the difference between obtained and expected fitness, that is, the estimated contribution of epistasis. This caveat actually strengthens our claim of detecting a positive epistatic component that is sizable and can even offset most of the total harm associated with single negative effects.

The ascent of systems biology has raised hopes that the “statistical” and “biological” sides of epistasis can be merged coherently (Moore and Williams, 2005; Phillips, 2008). Indeed, a truly functional explanation of the measured growth effects would require understanding how the material and energetic expenditures were altered by the same mutations when they were separate or combined. Such an ultimately metabolomic interpretation could be attempted only semi-quantitatively and for much simpler systems (Molenaar et al., 2009). Regarding proteome and transcriptome, the latter has the advantage of higher accuracy and repeatability, as well as sensitivity to products of weakly expressed genes. The relation between transcriptome and proteome is generally rigid in yeast. In yeast aneuploids, the bulk of changes observed in mRNA are generally reflected as proportional shifts in both the profile of translation and mature protein composition (Larrimore et al., 2020; Pavelka et al., 2010). The relation is not perfect as, for example, disomics do have a somewhat attenuated average expression of proteins coded at the doubled chromosome, but the outcome is still much closer to doubling than parity (Dephoure et al., 2014). The post-transcriptional attenuation can be more frequent and pronounced when single genes are doubled but this is a case much different from ours (Ascencio et al., 2021). Most importantly, the yeast transcriptome is currently unparalleled in providing broad and comprehensible insights into cell signaling and functioning under major transitions such as fast-slow growth, anabolism-catabolism, presence-absence of stress. In sum, although the analysis of mRNA obviously does not provide a full description of the cell’s state, it is a sensible introduction to it.

The simultaneous upregulation of genes coding for ribosomal proteins (RP) and downregulation of those coding for subunits of the proteasome was observed in all monosomic strains and likely provides crucial insight into the functional reaction of the cell to monosomy. The absence of a single chromosome meant halving the dose of the several genes coding for the translational apparatus that resided on it, while tens of others remained unaffected. The cell could not restore the required stoichiometry by targeted overexpression of the affected genes. The favorable environmental conditions signaled that translation had to be intensified, but functioning ribosomes were scarce, resulting in apparently indiscriminate overproduction, and withheld degradation, of all needed proteins. The reaction was inadequate and costly, but it should be seen as an attempt at ad hoc rebalancing, not a prepared (evolved) response. This interpretation, strongly suggested by the transcriptomic data, appears still more plausible after considering the functional diversity of the 468 genes selected as haploinsufficient. Table 1 shows that the diversity was ample: of the one hundred Slim GO Biological Process categories, dozens could be linked to every chromosome-associated subset of deletions. The number of affected processes was higher than the number of deletions because a single deletion often belonged to several categories (Saccharomyces Genome Database). However, there was a telling common motif. The processes involved in the “protein synthesis apparatus” (our designation) were most often represented. Their predicted harm was also especially high, typically matching or exceeding that required to reduce the growth rate to the levels actually observed in monosomics. The distortion of biosynthesis was thus real and its relation to the observed transcriptomic reaction was probably causal, not merely coincidental.

Yeast Slim GO Biological Process categories of the tested deletions and the predicted and observed relative doubling rate of the monosomic strains.

Some of the other adjustments seen in the monosomic transcriptomes could also be linked to the dominant effect of translational inefficiency. A signal to increase biosynthesis would mobilize fermentation and curb oxidation, which in turn downregulated genes coding for the ROS-scavenging proteins. The Msn2-mediated and other stress responses were seemingly absent, contrary to previous reports (Sheltzer et al., 2012). However, the hallmarks of stress reactions include decreased RP production and increased proteolysis (to remove destabilized proteins). These two processes were here pushed in opposite directions by the loss of RP stoichiometry, blurring the expected patterns. The non-homogeneity of reaction in the other analyzed regulons (cAMP-PKA, NCR, GAAC) means that functions other than ribosome assemblage were affected seriously but differently, depending on the composition of genes made insufficient by the loss of a chromosome. The environment was the same and rich, with nothing in short supply; therefore, any perceived deficiency of metabolites resulted from internal distortions.

Datasheet 2 shows a very large number of differences between monosomics in the expression of single genes, the variation that would be difficult to summarize or, still more, to fully understand in functional terms. If only some of the challenges met by individual monosomics were similar, while so many others disparate, then why was the uniformly positive epistasis observed?

A possible answer starts with the realization that the cell is an aggregate of multiple functional modules (Hartwell et al., 1999). Our monosomic strains had many different modules (recall the Slim GO categories) that were negatively affected by partially inactivating mutations. These mutations did not interact with each other directly but rather through the modules they belonged to. Mutations that affected the most critical module(s) suppressed the negative effect of those mutations that impaired other, non-limiting, modules. The latter were either less damaged or less needed under current conditions. The epiphenomenon of positive epistasis for fitness reflected the fact that not all mutations exerted their negative effects, at least not at full scale. In our case, this is not just speculation but rather the simplest and most prudent explanation linking the growth rate and transcriptomic data. Our experimental system is in a way special: a microbial cell with multiple modules compromised but exposed to favorable conditions and thus tested for the single capacity of growing fast. However, it is also illuminating. A unicellular organism is typically tested for only a few fitness-defining functions; these would be much different under prolonged starvation. The specialized cells of a complex organism are required to uphold only some, and different, elements of life. The modularity of the cell helps to understand how so many partial genetic insufficiencies can be carried by so many different organisms without causing a drastic decline in their fitness.

Interactions within pairs of gene knock-outs have attracted much attention, and some attempts to understand the uncovered variety of fitness effects have relied on the concept of cellular modularity (Costanzo et al., 2020; Segrè et al., 2005; Szappanos et al., 2011). High-order epistasis for fitness, arising from interactions between at least several mutations, has often been mentioned in the context of “diminished returns” of consecutive beneficial variants (Barrick et al., 2009; Chou et al., 2011; Khan et al., 2011; Kryazhimskiy et al., 2014). It has been proposed that the pattern emerges from interactions of mutations through some ‘global’ phenotypic interface, e.g., fitness (MacLean et al., 2010; Perfeito et al., 2014; Schoustra et al., 2016). However, a similar overall effect could be theoretically generated by the entirety of ‘idiosyncratic’ interactions, without any intermediating factors (Lyons et al., 2020; Reddy and Desai, 2021). Indeed, this has been demonstrated experimentally, at least for a selected and moderately large set of mutations (Bakerlee et al., 2022). Finally, the diminishing returns epistasis of beneficial mutations has been attributed to cellular modularity by postulating that the contribution of each functional module to fitness must have its upper limit (Wei and Zhang, 2019). Considering that negative epistasis between multiple positive effects can result from different mechanisms, we do not insist that the explanation we propose for positive epistasis between multiple negative effects is the only one possible. However, it can be especially applicable when the deleterious mutations are really numerous and distributed widely, though not necessarily evenly, across many cellular subsystems.

Materials and Methods

Single gene deletion strains

1.1. Selection of single gene deletions with a possible effect of haploinsufficiency

Deutschbauer et al. assayed a complete collection of heterozygous single gene deletion strains and identified a total of 184 genes, 98 essential and 86 non-essential, as haploinsufficient for growth in rich medium, YPD (Deutschbauer et al., 2005). We included this set of genes in the present study. Using a different technique, Marek and Korona tested all 1,142 essential heterozygous single-gene deletions and 946 non-essential ones selected as likely non-neutral for growth (Marek and Korona, 2016). We reviewed the latter study and, setting the false discovery rate at 0.15, accepted as many as 404 genes, 256 essential and 148 non-essential, as potentially haploinsufficient. The two sets obtained in the two studies, 184 and 404 genes, overlapped in 112 cases. The overlap was clearly higher than the 14.3 expected if the two sets were just random samples from among 5,200 yeast genes. On the other hand, it was limited, which suggested that new growth assays were desirable. The new assays are described below. They involved all unique strains identified in the two studies minus 5 that were not present in our strain collections and another 3 that were dropped as superfluous for the orthogonal design of experimental blocks described below. (They were also the 3 least promising based on the former assays.) Datasheet 1 lists the strains.
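The counts in this paragraph can be checked with a short Python sketch. It assumes that the expected overlap of two independent random samples is the product of the set sizes divided by the number of genes; the function name is hypothetical, while all numbers are taken from the text above.

```python
def expected_overlap(size_a: int, size_b: int, population: int) -> float:
    """Expected number of shared items between two independent random samples."""
    return size_a * size_b / population


set_deutschbauer = 184   # haploinsufficient genes from Deutschbauer et al., 2005
set_marek = 404          # genes accepted from Marek and Korona, 2016
observed_overlap = 112   # genes present in both sets
yeast_genes = 5200       # gene count used for the random expectation

overlap_by_chance = expected_overlap(set_deutschbauer, set_marek, yeast_genes)
unique_genes = set_deutschbauer + set_marek - observed_overlap
assayed_strains = unique_genes - 5 - 3  # strains unavailable, strains dropped for the design

print(f"expected overlap: {overlap_by_chance:.1f}")
print(f"unique genes: {unique_genes}, assayed strains: {assayed_strains}")
```

The result agrees with the 468 heterozygous deletion strains referred to in the Results.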

1.2. Control for single deletion strains

To correctly quantify the predictably small negative growth effects of heterozygous single deletions, an unbiased and precise estimate of the wild-type phenotype is needed. In the present study, we sought to accomplish this by setting not a single strain but multiple strains as a control. The reason was that the gene deletions used here were constructed over several years by multiple laboratories and were then handled with repeated rounds of propagation that could result in genetic divergence. We therefore considered it risky to choose a single strain as a control for all others. We looked for a group of strains that would be indistinguishable in terms of growth rate, suggesting that they have not acquired genetic alterations affecting this trait. In the case of non-essential genes, we reviewed the Saccharomyces Genome Database and found 25 ORFs termed “dubious”, at present nearly certain to be false ORFs, which were located between other ORFs and no closer than 100 bp to their START or STOP codons. Heterozygous strains carrying deletions of these ORFs were assayed for doubling rate in a way specific to this study (described below). After replicated assays, the 16 deletion strains closest to the median growth performance, and statistically not different from each other, were selected as the control ones. Knowing that the non-essential and essential strains differed somewhat in their origin and subsequent handling (Brachmann et al., 1998), we decided to derive a separate set of control strains for the latter. We selected 32 essential genes that were in the very center of a single, strong and narrow modal peak of the frequency distribution of estimates collected for the whole collection of heterozygous essentials (Marek and Korona, 2016). Again, after replicated tests of growth rate, the 16 strains closest to the median value and not different from each other were selected to constitute the final control for essential gene deletions. Control strains of both groups are listed in Datasheet 1.

Monosomic strains

2.1. Parents of monosomic strains

In a former study, we used 32 strains that had one counter-selectable marker (URA3) close to the center of each of the 32 chromosomal arms and a drug resistance marker (kan) close to a centromere (Tutaj et al., 2022). Pairs of strains with the same centromere marker and the counter-selectable markers residing on either the left or right arm of the same chromosome were mated with each other. The resulting diploids were sporulated and tetrads dissected to get triple-marked haploid genotypes, URA3-kan-URA3. The latter were mated with a standard BY haploid strain of the opposite mating type. In the final set of 16 diploid strains, each strain had one chromosome triple-marked while its homolog and the rest of the 15 chromosome pairs were isogenic with the diploid strain BY4743.

2.2. Derivation and verification of monosomic strains

The diploid strains with triple-marked chromosomes were grown overnight in synthetic complete medium (SC), and then 50 to 500 µl samples of the resulting cultures were overlaid on standard 5-FOA (5-fluoroorotic acid) plates. Emerging colonies were transferred onto new 5-FOA plates and YPD plates with 200 µg/ml of geneticin. The goal was to identify variants that were able to grow on the 5-FOA but not on the geneticin plates, as this indicated that all three marker genes had possibly been lost together with an entire chromosome. Colonies identified in this way were often of different sizes, suggesting that they were genetically heterogeneous. The next criterion was to find variants that would form visibly smaller colonies than those of the parental strain but rarely, though regularly, produce colonies similar to the latter. Such strains were grown as replicate small cultures (5 to 100 µl, depending on the particular monosomic strain), tested for a negligible frequency of cells forming large colonies, and collected into larger samples enabling isolation of DNA in amounts sufficient for next-generation sequencing. Monosomy was considered confirmed when the number of reads was halved along the whole length of a single chromosome, the expected one. Although simple, this protocol required multiple attempts to be completed for some chromosomes: few colonies tended to appear on the 5-FOA plates, most of them remained resistant to geneticin, and those which passed these two criteria did not show the required reversibility, that is, the tendency to return occasionally to normal growth in apparently one step. Even if all these phenotypic criteria were met, the sequencing occasionally uncovered genomes other than those of pure monosomics, that is, genomes other than those with one whole chromosome, and only that chromosome, removed. Nevertheless, once confirmed, the monosomic strains could be propagated reliably on rich and synthetic media, including that with 5-FOA, as long as the recurrent emergence of fast-growing colonies was monitored and countered.

2.3. Control for monosomic strains

The monosomic strains were all derived by us, and the derivation involved our stock of haploid BY4741 and BY4742 strains. These two strains were crossed to obtain a diploid BY4743, devoid of the URA3MX4 and kanMX4 cassettes, which was then used as a control in growth assays of the monosomic strains.

Estimation of DR and rDR

The collection of single-gene deletion strains was arrayed on 6 flat-bottom 96-well (8×12) titration plates with 150 µl aliquots per well. Within each plate, the first and last wells contained clean YPD medium, rows 3 and 10 hosted control strains, and deletion strains occupied the rest of the plate. Plates were filled with either essential or non-essential deletions accompanied by the respective 16 control strains mentioned above. One plate contained both essential and non-essential deletions together with 8 essential and 8 non-essential control strains. Plates were inoculated from thawed samples at 1-5% and kept non-agitated for 48 hours at 30 °C, by which time they had reached approximately similar densities of stationary-phase cells. Such conditioned micro-cultures were used to inoculate plates with fresh YPD at 0.5%, which were then kept at 30 °C with 1,000 rpm agitation. The OD (600 nm) of the cultures was read every 0.5 h with a TECAN Infinity reader. Four independent replicates of the measurements, starting with independent conditionings, were carried out. OD reads were used to calculate DR (doubling rate). To get rDR (relative doubling rate), every DR estimate of an experimental strain was divided by the average DR of the control strains residing on the same plate. In the case of the monosomic strains, the whole protocol was analogous except that all experimental and control strains were held within one plate. The cultures of monosomic strains used in this assay were confirmed to contain less than 1% of fast-growing cells at the time of OD measurements.
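The step from OD reads to DR and rDR can be illustrated with a minimal sketch. The text does not specify how DR was extracted from the growth curves, so the sketch assumes that DR is the slope of log2(OD) against time (doublings per hour) within an exponential window; the window limits `od_min` and `od_max`, the function names, and the synthetic data are all hypothetical.

```python
import math

def doubling_rate(times_h, od, od_min=0.05, od_max=0.4):
    """Least-squares slope of log2(OD) vs. time, in doublings per hour.

    Only reads with od_min <= OD <= od_max are used; this window is an
    assumption of the sketch, not a value taken from the protocol.
    """
    pts = [(t, math.log2(v)) for t, v in zip(times_h, od) if od_min <= v <= od_max]
    if len(pts) < 2:
        raise ValueError("too few OD reads inside the exponential window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    return sxy / sxx

def relative_doubling_rates(strain_dr, control_dr):
    """Divide each strain's DR by the mean DR of the controls on the same plate."""
    control_mean = sum(control_dr) / len(control_dr)
    return {name: dr / control_mean for name, dr in strain_dr.items()}

# Synthetic example: reads every 0.5 h, one doubling per 1.5 h.
times = [0.5 * i for i in range(12)]
od_reads = [0.05 * 2 ** (t / 1.5) for t in times]
dr = doubling_rate(times, od_reads)
# A hypothetical strain growing at half the rate of two identical controls.
rdr = relative_doubling_rates({"strain_A": dr / 2}, [dr, dr])
```

Normalizing by the controls on the same plate, as described above, removes plate-to-plate differences in growth conditions from the comparison.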

Analysis of DNA and RNA

An initial step in preparing for both the DNA and RNA analysis was to collect samples of monosomic cells that would be nearly free of the fast-growing revertant cells. Individual monosomic strains were grown in YPD at 30 °C up to stationary phase as replicate micro-cultures (5 to 100 µl). The latter were serially diluted and plated to test for the appearance of large colonies suggesting the rise of compensatory mutations. The micro-cultures in which colonies typical for a particular monosomic strain constituted more than 99% were pooled together. Such tested stationary cultures of monosomic cells were directly used to extract DNA as a template for high-coverage sequencing (PE 150, expected read depth ∼80). The resulting reads were mapped with bowtie2 (Langmead and Salzberg, 2012) to the standard sequences of yeast chromosomes: Ensembl release 100, S. cerevisiae genome R64-1-1. Duplicate reads were marked with MarkDuplicates (Picard Toolkit 2019. Broad Institute, GitHub). Samtools (1.15.1) was used for sorting and indexing BAM files and for coverage analysis (Danecek et al., 2021).
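The coverage criterion for monosomy (read numbers halved along one chromosome) can be sketched as follows. This is an illustration, not the pipeline used in the study: it assumes that a mean read depth per nuclear chromosome has already been obtained (for example, from a per-chromosome coverage summary), and the function names, the tolerance of 0.1, and the depth values are hypothetical.

```python
from statistics import median

def relative_coverage(mean_depth):
    """Scale each chromosome's mean depth by the median over all chromosomes."""
    baseline = median(mean_depth.values())
    return {chrom: depth / baseline for chrom, depth in mean_depth.items()}

def monosomic_candidates(mean_depth, expected=0.5, tolerance=0.1):
    """Return chromosomes whose relative coverage is close to one half.

    The tolerance is an arbitrary choice made for this sketch.
    """
    rel = relative_coverage(mean_depth)
    return sorted(c for c, r in rel.items() if abs(r - expected) <= tolerance)

# Hypothetical mean depths for four chromosomes of one sequenced isolate.
depths = {"I": 80.0, "II": 82.0, "III": 41.0, "IV": 79.0}
candidates = monosomic_candidates(depths)
```

A chromosome-wide mean cannot distinguish loss of a whole chromosome from a large segmental deletion, so a fuller check would apply the same ratio to windows along the chromosome.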

In the case of RNA, the purity-tested cultures of monosomic cells were transferred to fresh YPD and incubated with agitation at 30 °C for 4 hours. Total RNA was then extracted with the RiboPure™ RNA Purification Kit. Three replicates of each monosomic strain and of the control BY4743 strain were prepared in this way. Library preparation and PE 150 sequencing were carried out by Novogene. About 20 million read pairs were obtained per sample. Quality control of the reads was performed with FastQC v0.11.9 (Andrews, 2010). The RNA reads were mapped to the above-specified standard sequence with HISAT2 v2.1.0 (Kim et al., 2015). The resulting alignment files were sorted and indexed with samtools (1.9). Transcript quantification was performed with cuffquant/cuffnorm v2.2.1 (Trapnell et al., 2012). Gene count data normalization (“TMM” method) and differential expression analysis (exact test) were performed with edgeR (Robinson et al., 2010).

Acknowledgements

We thank J. Bobula and A. Pirog for their experimental assistance.

Additional information

Funding

National Science Centre of Poland grant NCN 2017/25/B/NZ2/01036 (RK)

National Science Centre of Poland grant NCN 2014/13/B/NZ8/04668 (KT)

Jagiellonian University grant DS/MND/WB/INoS/10/2018 (HT)

The open-access publication of this article was funded by the programme “Excellence Initiative Research University” at the Faculty of Biology of the Jagiellonian University in Kraków, Poland.

Author contributions

Conceptualization: HT, RK

Experimental design: HT, KT, RK

Experiments: HT, RK

Analysis: HT, KT, RK

Visualization: HT

Writing – original draft: RK

Writing – review & editing: HT, RK

Competing interests

The authors declare no competing interests.

Additional files

Data availability

All data generated or analyzed during this study are included in this published article (and its supplementary information files) or are deposited in the BioProject and GEO (Gene Expression Omnibus) public databases. Accession numbers can be provided upon request.

Datasheet 1 (separate Excel file)

Datasheet 2 (separate Excel file)

Supplementary Materials

Supplementary Text

Transient monosomy of chromosomes VII and XIII

The procedure for isolating monosomics described in Material and Methods was applied in the same way to all strains with marked chromosomes, including both VII and XIII. For both of them, we tended to get Ura- colonies (on 5-FOA agars) that were mostly large (wild-type size), with small ones amounting to a few percent of all isolates. Looking for prospective monosomics, we initially concentrated on the small colonies. Of no fewer than 100 isolates for each parent strain, not a single one met our criteria. Most were kanR, and none of the remaining ones produced small and largely uniform colonies that would tend to revert to wild-type growth. Despite these discouraging signs, two of the most promising candidates for each chromosome were sequenced. Complex polyploidies instead of regular monosomy were detected in these trials.

We then examined the predominant large Ura- colonies. They emerged at median frequencies of 3.1E−06 and 6.3E−06 for chromosomes VII and XIII, respectively (number of colonies divided by the number of cells plated on 5-FOA). For each parental strain, 108 colonies, collected from several separately inoculated selective agars, were taken at random for further assays. All these isolates grew fast when re-streaked. Only 2 of those derived from the strain hosting the marked chromosome VII turned out to be kanR, and none in the case of chromosome XIII. To interpret these results, it is useful to recall that the URA3 marker was placed in the middle of both the left and right chromosomal arms while kan occupied one of the ORFs closest to the centromere:

telomere-URA3-kan-URA3-telomere.

The homolog was free of all three markers. In principle, two mitotic cross-overs, between the telomere and URA3 on each arm, could produce one progeny cell with two marker-free chromosomes. It would be a regular diploid and thus have the wild-type growth rate. There are two arguments against this suggestion. First, a double cross-over between URA3 and kan would be similarly frequent, and therefore cells lacking URA3 but retaining kan would not be as rare as observed here. Second, we measured the frequency of crossing-over on the two chromosomes in our earlier study (Tutaj et al., 2022). Based on our estimates, two simultaneous cross-overs located within the distal halves of the arms of chromosomes VII and XIII would happen at rates of 1.1E−08 and 7.9E−09, respectively, which is more than two orders of magnitude lower than observed.
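The size of the gap between the observed and expected frequencies can be checked with a few lines of arithmetic. The numbers are those quoted in the text; the variable and function names are hypothetical.

```python
import math

# Median frequencies of large Ura- colonies observed on 5-FOA.
observed = {"VII": 3.1e-06, "XIII": 6.3e-06}
# Rates expected for two simultaneous cross-overs (estimates from Tutaj et al., 2022).
expected_double_crossover = {"VII": 1.1e-08, "XIII": 7.9e-09}

def fold_excess(obs, exp):
    """Ratio of observed to expected frequency for each chromosome."""
    return {chrom: obs[chrom] / exp[chrom] for chrom in obs}

fold = fold_excess(observed, expected_double_crossover)
# Express the same ratios as orders of magnitude (log10).
orders = {chrom: math.log10(f) for chrom, f in fold.items()}
```

The ratios come to roughly 280-fold for chromosome VII and roughly 800-fold for chromosome XIII, that is, between two and three orders of magnitude.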

The fast-growing and marker-free isolates were then subjected to sporulation and tetrad dissection. For every chromosome, 8 isolates were selected at random and at least 10 tetrads were dissected. In each of the 16 assayed strains, tetrads with four viable haploids constituted a clear majority, as expected for the products of meiosis in a regular diploid cell. For each chromosome, two of the 8 strains analyzed in this way were further tested by high-coverage DNA sequencing. No sequences of the two markers or the MX4 cassette that hosted them were found. The three loci in question contained only the expected wild-type sequences with a coverage typical for the rest of the genome, that is, diploid (Supplementary Figure S2).

To summarize, we did not find a single slow-growing monosomic strain for chromosomes VII and XIII even though we searched for them especially intensely. The regularly arising fast-growing isolates with the sought phenotype were highly unlikely to result from recombinational events, such as a double cross-over. They were normal diploids in which the chromosomes in question had the phenotypic alleles and DNA sequences characteristic of the non-marked homologs. These findings imply that the marked chromosomes were lost and their non-marked homologs underwent endoreduplication.

The unsuitability of chromosome XII for monosomy research

The parent strain with triple-marked chromosome XII tended to produce unusually numerous colonies on the selective 5-FOA agars, mostly as large as those of regular diploid strains. We did not analyze those large colonies because we considered them likely to have been produced by recombination and thus unrelated to monosomy. This chromosome contains the only yeast rDNA region, which is about 1 Mb long and is known to be a strong recombination hotspot. Therefore, the quantitative arguments excluding double cross-overs, developed for chromosomes VII and XIII, were inapplicable here. Small colonies were also more numerous than in the case of other strains. The tests of markers and growth rate reversibility were applied to several hundred of them but failed in nearly all trials. A few of the most promising isolates were sequenced, and none of them revealed the sought regular monosomy. We decided to stop our search for M12 at this point because the presence of rDNA made it impracticable.

Parallel and divergent shifts in transcriptomes of monosomic strains.

Heat maps show monosomic mRNA frequencies divided by respective diploid (control) ones. (a) Gene Ontology categories selected to demonstrate similarities in transcriptional profiles of monosomic strains. (b) Regulons demonstrating differences in gene expression between monosomic strains. (This figure is an expanded version of Fig. 4 in the main text.)

DNA whole genome sequencing coverage after the postulated endoreduplication.

Two isolates descending from the parental diploid strains with marked chromosomes VII or XIII are shown. They were subjected to sequencing after being found to lack phenotypic markers and produce four viable spores.