Diversity for commonality in the evolutionary changes of the reduced genome to regain the growth fitness

Kenya Hitomi; Yoichiro Ishii; Bei-Wen Ying

doi:10.7554/eLife.93520.1

eLife assessment

This important report studies the recovery of genome-reduced bacterial cells in laboratory evolution experiments to understand how they regain their fitness. Through the analysis of gene expression and a series of tests, the authors discover distinct molecular changes in the evolved bacterial strains and propose that various mechanisms are employed to offset the effects of a reduced genome. While the findings have intriguing implications for understanding genome evolution, it is crucial to note that the evidence supporting these claims is incomplete due to insufficient experimental tests and statistical analysis.

https://doi.org/10.7554/eLife.93520.1.sa3

Significance of findings

important: Findings that have theoretical or practical implications beyond a single subfield

Strength of evidence

incomplete: Main claims are only partially supported

During the peer-review process the editor and reviewers write an eLife assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate).

Abstract

As the genome encodes the information crucial for cell growth, a sizeable genomic deficiency often causes a significant decrease in growth fitness. Whether and how evolution could compensate for the decreased growth fitness caused by genome reduction was investigated here. Experimental evolution of an Escherichia coli strain carrying a reduced genome was conducted in multiple lineages for approximately 1,000 generations. The growth rate, which had declined markedly due to genome reduction, recovered considerably, associated with the improved carrying capacity. Genome mutations accumulated during evolution varied significantly across the evolutionary lineages and were randomly localized on the reduced genome. Transcriptome reorganization showed a common evolutionary direction and conserved the chromosomal periodicity, despite the highly diversified gene categories, regulons, and pathways enriched in the differentially expressed genes. Genome mutations and transcriptome reorganization caused by evolution, which were dissimilar to those caused by genome reduction, must have followed divergent mechanisms in individual evolutionary lineages. Gene network reconstruction identified three functionally differentiated gene modules, which were responsible for the evolutionary changes of the reduced genome in growth fitness, genome mutation, and gene expression, respectively. The diverse evolutionary approaches improved the growth fitness while keeping a homeostatic transcriptome architecture, as if the evolutionary compensation for genome reduction followed the proverb that all roads lead to Rome.

Introduction

The genome encodes the information for cell growth, and its size is likely a consequence of evolution ^{1, 2, 3}. To determine the essential genomic sequence of modern cells, the removal of redundant DNA sequences from the wild-type bacterial genome, so-called genome reduction, has been attempted extensively ^{4, 5, 6, 7}. These efforts aimed to discover the minimal genetic requirement for a free-living organism growing under defined conditions ^{8, 9, 10}. They revealed that the genome is coordinated with cell growth, i.e., genome reduction significantly decreased the growth rate of Escherichia coli (E. coli) cells independent of culture media or growth forms ^{11, 12, 13}. Slow growth or fitness decline was commonly observed in genetically reduced ^{13, 14} and chemically synthesized genomes ^{10, 15}.

The growth decrease could be recovered by experimental evolution. Growth fitness generally represents the adaptiveness of a living organism to a defined environment ¹⁶. Although reduced genomes showed somewhat different evolvability compared to wild-type genomes ^{17, 18, 19}, their evolutionary adaptation to environmental changes has been achieved ^{20, 21, 22}. Experimental evolution under a defined culture condition increased the decreased growth rates of reduced genomes ^{17, 20, 21} and accelerated the growth of a slow-growing synthetic genome ²³. The evolutionary rescue of the growth rate must be associated with changes in genomic sequence and gene expression that benefit growth fitness, as happens in natural adaptive evolution ^{24, 25, 26}. Our previous study showed that the decreased growth rate caused by the absence of a sizeable genomic sequence could be complemented by introducing mutations elsewhere in the genome ^{20, 21}. Additionally, the changes in growth rate caused by either genome reduction or experimental evolution depended on the genome size but not on the specific gene function ²⁰.

A genome-wide understanding of the evolutionarily compensated fitness increase of the reduced genome is required. The compensation by experimental evolution for the growth changes mediated by genome reduction was considered to be closely related to transcriptome reorganization. Our previous studies found conserved features in transcriptome reorganization, although the gene expression patterns were significantly disturbed by either genome reduction or experimental evolution ^{27, 28, 29}. This indicated that experimental evolution rescued the growth rate disturbed by genome reduction via underlying mechanisms that maintain the homeostasis of a growing cell. How the growth recovery of reduced genomes was achieved and whether it followed a general rule remain unclear. To address these questions, we conducted experimental evolution with a reduced genome in multiple lineages and analyzed the evolutionary changes of the genome and transcriptome in the present study.

Results

Fitness recovery of the reduced genome by experimental evolution

Experimental evolution of the reduced genome was conducted to regain the growth fitness, which had decreased due to genome reduction. Serial transfer was performed with multiple dilution rates to keep the bacterial growth within the exponential phase (Fig. S1), as described ^{17, 20}. Nine evolutionary lineages were conducted independently (Fig. S2, Table S1). The growth rate increased gradually over the generations, with a rapid increase in the early evolutionary phase followed by a slow increase in the later phase (Fig. 1A). Most evolved populations (Evos) showed improved growth fitness. Nevertheless, the final growth rates varied considerably (Fig. 1B, upper), and the evolutionary dynamics of the nine lineages were somewhat divergent (Fig. S2). This indicated diversity in the evolutionary approaches for improved fitness. Eight of nine Evos achieved faster growth than the genome-reduced ancestor (Anc), whereas all Evos decreased in their saturated population densities (Fig. 1B, bottom). In comparison to the wild-type strain carrying the full-length genome (WT), the growth rate that had decreased due to genome reduction was significantly improved by experimental evolution (Fig. 1C, upper), accompanied by a considerable reduction in saturated density (Fig. 1C, bottom). This demonstrated that experimental evolution could compensate for the genome reduction with a trade-off in population size (carrying capacity), consistent with the previous findings ^{20, 21}.

Fitness increase of the reduced genome mediated by experimental evolution.
A. Temporal changes in growth rate. Color variation indicates the nine evolutionary lineages. B. Growth rate and maximal population size of the reduced genome. Blue and pink indicate the common ancestor and the nine evolved populations, respectively. Standard errors are shown according to the biological replications (N=4∼6). C. Boxplots of growth rate and maximal population size. Cross and open circles indicate the mean and individual values, respectively. Statistical significance evaluated by Mann-Whitney U tests is indicated. D. Correlation between growth rate and maximal population size. Spearman’s rank correlation coefficient and p-value are indicated.

Intriguingly, a positive correlation was observed between the growth fitness and the carrying capacity of the Evos (Fig. 1D). The present experimental evolution thus did not appear to obey the r/K selection theory ^{30, 31}, which predicts a trade-off (negative correlation) between the growth rate and the carrying capacity ^{32, 33}. Taking into account our previous finding that the colony growth rates of genome-reduced strains were proportional to the colony sizes ¹¹, the collapse of the trade-off law likely resulted from genome reduction. As the trade-off between growth fitness and carrying capacity was proposed to balance cellular metabolism owing to the cost of the enzymes involved ³³, the deleted genomic sequences might play a role in maintaining the metabolic balance.

Significant variation and random localization of genome mutations

Genome resequencing (Table S2) identified a total of 65 mutations fixed in the nine Evos (Table 1). The number of mutations varied largely among the nine Evos, from two to 13, and no common mutation was detected (Table S3). Of the 65 mutations, 51 occurred in genes, and 45 of these 51 were SNPs. As 36 out of 45 SNPs were nonsynonymous, the mutated genes might benefit the fitness increase. In addition, the abundance of mutations was unlikely to be related to the magnitude of fitness increase. For instance, A2, which accumulated only two mutations, presented a highly increased growth rate, whereas F2, which accumulated eight mutations (Table 1), improved its growth only poorly (Fig. 1B). B2, D2, and E2 all succeeded in fitness increase to an equivalent degree (Fig. 1B), whereas they fixed 13, 7, and 3 mutations, respectively (Table 1). The mutated genes fell into 14 gene categories, somewhat more frequently into the categories of Transporter, Enzyme, and Unknown function (Fig. S3). They seemed highly related to essentiality, as 11 out of 49 mutated genes were essential (Table S3). As essential genes are known to be more conserved than nonessential ones ^{34, 35}, the high frequency of mutations fixed in essential genes suggested that mutating essential genes to increase fitness was an evolutionary strategy of the reduced genome. The large variety of genome mutations and the lack of correlation between mutation abundance and fitness improvement strongly suggested that no particular mutation was specifically responsible or essential for recovering the growth rate of the reduced genome.

Overview of fixed genome mutations.
The numbers of mutations in the nine Evos are shown separately and summed. SNP, N, and S indicate single nucleotide substitution, nonsynonymous SNP, and synonymous SNP, respectively.

Additionally, none of the 65 mutations in the nine Evos overlapped in genomic position (Fig. 2A). A random simulation was performed to verify whether there was any bias or hotspot in the genomic location of mutation accumulation (Fig. 2B). The simulation, in which 65 mutations occurred randomly on the reduced genome, was repeated 1,000 times. The distance from each mutation (mutated genomic location) to the nearest genomic scar caused by genome reduction was calculated. Welch’s t-test was performed to evaluate the significance of the locational bias between the random mutations and the mutations accumulated in the Evos (Fig. 2C). As the mean of the p values from the simulations was insignificant (μ_p > 0.05), no locational bias for mutation accumulation caused by genome reduction was detected.

Genomic localization of mutations.
A. Normalized genomic positions of all mutations. The vertical lines highlight the total 65 mutations fixed in the nine Evos. Color variation indicates the nine Evos. WT and Reduced represent the wild-type and reduced genomes used in the present study. B. Normalized genomic positions of random mutations. The simulation of 65 mutations randomly fixed in the reduced genome was performed 1,000 times. As an example of the simulation, the genomic positions of 65 random mutations are shown. The vertical lines in purple indicate the mutations. C. Statistical significance of the genome locational bias of mutations. The distance from the mutated location to the nearest genomic scar caused by genome reduction was calculated. The mutations accumulated in the nine Evos and those of the 1,000 random simulations were all subjected to the calculation. The significance of the genome locational bias of the mutations in the Evos was evaluated by Welch’s t-test. The histogram of the p-values from the 1,000 tests, one per simulation, is shown. The mean of the p-values (μ_p) is indicated, together with its 95% confidence interval (0.07 < μ_p < 0.09).

Common evolutionary direction and homeostasis in transcriptome reorganization

Since no specificity was detected in the genome mutations, whether these mutations disturbed the genome-wide gene expression pattern was investigated. Hierarchical clustering and principal component analysis (PCA) showed that the evolved transcriptomes converged toward similar patterns but diverged from the WT transcriptome (Fig. S4). The transcriptomes of the reduced genome thus did not evolve toward the wild-type transcriptome. As a global feature of gene expression, the chromosomal periodicity of the transcriptome was analyzed using the Fourier transform as previously described ^{28, 36}. As a result, the transcriptomes of all Evos presented a common, statistically significant periodicity of six periods, equivalent to those of the wild-type and ancestral reduced genomes (Fig. 3A, Table S4). This demonstrated that the chromosomal architecture of gene expression patterns remained highly conserved, regardless of the considerably varied mutations. The homeostatic periodicity was consistent with our previous findings that the chromosomal periodicity of the transcriptome was independent of genomic or environmental variation ^{27, 28}.
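As a rough illustration of how a dominant chromosomal period can be read from position-ordered expression data, the sketch below applies a plain discrete Fourier transform to binned expression values and reports the number of periods with the largest power. It is not the procedure of refs 28 and 36, which is not reproduced in this text; the function name `dominant_period_count`, the binning layout, and the synthetic signal are assumptions made for illustration only.

```python
import cmath
import math

def dominant_period_count(binned_expression, max_periods=20):
    """Return the number of periods per chromosome with the largest DFT power.

    binned_expression: mean expression level per equal-width genomic bin,
    ordered by chromosomal position (hypothetical input layout).
    """
    n = len(binned_expression)
    mean = sum(binned_expression) / n
    centered = [x - mean for x in binned_expression]  # drop the zero-frequency term
    best_k, best_power = None, -1.0
    for k in range(1, min(max_periods, n // 2) + 1):
        # DFT coefficient for k full cycles around the chromosome
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return best_k

# Synthetic signal (not real data): six strong cycles plus a weaker two-cycle component.
signal = [math.sin(2 * math.pi * 6 * i / 360) + 0.3 * math.sin(2 * math.pi * 2 * i / 360)
          for i in range(360)]
print(dominant_period_count(signal))
```

A statistical test of the detected period, as reported in Table S4, would be a separate step that this sketch does not attempt.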

Chromosomal periodicity of transcriptome and mutated gene expression.
A. Chromosomal periodicity of transcriptomes. The transcriptomes of the nine Evos are shown. Black lines, red curves, and red vertical lines indicate the gene expression levels, fitted periods, and locations of mutations, respectively. *Ori* and *dif* are indicated with the vertical broken lines. B. Boxplot of gene expression levels. Gene expression levels of the 49 mutated genes in the nine Evos and the remaining 3,225 genes are shown. Statistical significance evaluated by Welch’s t-test is indicated.

In addition, the genomic locations of the mutations seemed irrelevant to chromosomal periodicity (Fig. 3A, red lines). No mutagenesis hotspot was observed even when these mutations were mapped together on a single genome (Fig. 2A). In contrast, the expression levels of the mutated genes were somewhat higher than those of the remaining genes (Fig. 3B), which was reasonable. The ratio of essential genes among the mutated genes was ∼22% (Table S3), much higher than the ratio (∼9%) of essential genes to all genes (302 out of 3,290) in the reduced genome. As essential genes showed higher expression levels than nonessential ones ³⁷, the high essentiality of the mutated genes might result in a higher mean expression level. On the other hand, the high frequency of mutations fixed in essential genes was unexpected because essential genes are generally more conserved than nonessential ones ³⁴. This strongly suggested that mutations in essential genes, which contribute to the homeostatic transcriptome architecture, occurred in the reduced genome.

Diversified functions and pathways of the differentially expressed genes

As the evolved transcriptomes were differentiated from those of the WT and reduced genomes (Fig. S4), the differentially expressed genes (DEGs) were further identified to discover the gene functions or biological processes contributing to the fitness changes. The number of DEGs in the Evos varied from 333 to 1,130 genes, with few overlaps (Fig. 4A). Most DEGs were unique to each evolutionary lineage, and only 108 DEGs were common to all Evos (Table S5). The number of DEGs shared among the Evos declined significantly as the number of lineages sharing them increased (Fig. 4B). Enrichment analysis showed that only the histidine-related pathways were significantly enriched in the common DEGs (Fig. 4C). Functional enrichment of the DEGs in individual Evos showed that amino acid metabolism participated considerably and that the enriched pathways overlapped poorly (Fig. S5). These findings strongly suggested no universal rule for evolutionary changes of the reduced genome.
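The overlap counts described above can be sketched with plain set operations. The example below is a minimal illustration under stated assumptions: the lineage labels and gene identifiers are made up, the function names `overlap_profile` and `common_degs` are hypothetical, and the profile counts genes shared by exactly k lineages, which may or may not match how Fig. 4B was tallied.

```python
from collections import Counter

def overlap_profile(deg_sets):
    """Count genes by the number of lineages in which they are DEGs.

    deg_sets: dict mapping a lineage label to its set of DEG identifiers.
    Returns {k: number of genes that are DEGs in exactly k lineages}.
    """
    gene_hits = Counter(gene for degs in deg_sets.values() for gene in set(degs))
    return dict(sorted(Counter(gene_hits.values()).items()))

def common_degs(deg_sets):
    """Genes that are DEGs in every lineage."""
    return set.intersection(*(set(degs) for degs in deg_sets.values()))

# Toy input with invented identifiers; not the DEG lists of Table S5.
toy = {
    "lineage_1": {"g1", "g2", "g3"},
    "lineage_2": {"g1", "g2", "g4"},
    "lineage_3": {"g1", "g5"},
}
print(overlap_profile(toy))
print(sorted(common_degs(toy)))
```

With nine real DEG lists, the entry for k = 9 in the profile would correspond to the DEGs common to all Evos.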

Differentially expressed genes (DEGs) and their enriched functions.
A. Commonality of DEGs in the nine Evos. Closed circles represent the combinations of the Evos. Vertical and horizontal bars indicate the number of the overlapped DEGs in the combinations and the number of all DEGs in each Evo, respectively. The combinations with more than 20 DEGs in common are shown. B. The number of DEGs overlapped among the Evos. The numbers of DEGs overlapped across 2 to 9 Evos are shown. The number of DEGs detected in only a single Evo is indicated as 1. C. Enriched function in common. The KEGG and GO terms enriched in the common DEGs across the nine Evos are shown. The statistical significance (FDR) of the enriched pathways and biological processes is shown on a logarithmic scale represented by color gradation.

In comparison, 1,226 DEGs were induced by genome reduction. The DEGs common to genome reduction and evolution varied from 168 to 540, fewer than half of the DEGs induced by genome reduction in every Evo (Fig. 5A). The conclusion remained consistent even when the DEGs were determined with an alternative method, RankProd (Fig. S6). Functional enrichment of the DEGs identified the specific transcriptional regulation and metabolic pathways participating in the transcriptome reorganization in response to genome reduction and evolution. Only σ38 was enriched in the genome reduction-mediated DEGs, and it was the only regulon that partially overlapped with the Evos (Fig. 5B). No regulon was enriched in common, apart from a few partially overlapping regulons, i.e., GadW, GadX, and RcsB (Fig. 5B). In addition, both the number of enriched pathways and their overlaps differed significantly among the nine Evos and between the Evos and the reduced genome (Fig. 5C). No pathways were enriched in common between the genome reduction-mediated DEGs and the Evos, regardless of whether the metabolic pathways were annotated with GO or KEGG (Fig. S7). Flagellar assembly, the only pathway enriched in the genome reduction-mediated DEGs, was absent in all Evos (Fig. 5D). In contrast, amino acid-related metabolisms were frequently detected in the Evos, e.g., histidine metabolism and biosynthesis of amino acids (Fig. 5D, Fig. S7). The variable pathways in the Evos indicated that evolution compensated for the genome reduction in various ways, which differed from the changes caused by genome reduction itself.

Transcriptome comparison between genome reduction and evolution.
A. Venn diagrams of DEGs induced by genome reduction and evolution. The numbers of individual and overlapped DEGs are indicated. B. Heatmap of enriched regulons. Statistically significant regulons are shown with the FDR values on a logarithmic scale. C. Number of enriched functions in common. Left and right panels indicate the numbers of enriched GO terms and KEGG pathways caused by genome reduction and evolution, respectively. D. Enriched functions in common. The overlapped GO terms enriched in the nine Evos and genome reduction are shown. Blue and pink represent genome reduction and evolution, respectively.

Gene modules responsible for the evolutionary changes of the reduced genome

Genome mutation analysis and transcriptome analysis failed to identify common gene categories or pathways correlated to the evolution of the reduced genome; thus, the gene modules correlated to evolution were newly evaluated. The weighted gene co-expression network analysis (WGCNA) ³⁸ was applied to the evolved transcriptomes, as tested previously ²⁷. Reconstruction of the 3,290 genes in the reduced genome led to 21 gene modules, comprising 8 to 320 genes per module (Fig. S8). Hierarchical clustering of these modules showed that they could be roughly divided into three major classes (Fig. 6A). Functional correlation analysis showed that three of the 21 modules (M2, M10, and M16) were highly significantly correlated to the growth fitness, the number of DEGs, and the mutation frequency, respectively (Fig. 6A). This indicated that the three modules were highly essential and functionally differentiated for growth control, transcriptional change, and mutagenesis.

Reconstructed gene modules.
A. Cluster dendrogram of the gene modules reconstructed by WGCNA. A total of 21 gene modules (M1∼M21) were reconstructed. The significance of the correlation coefficients of the gene modules to growth, mutation, and expression is represented in purple gradation. From light to dark indicates the logarithmic p-values from high to low. B. Enriched functions of gene modules and deletion. Enriched gene categories, regulons, and GO terms are shown from left to right. The numbers of the genes assigned in the three gene modules and the genomic deletion for genome reduction are indicated in the brackets. Color gradation indicates the normalized p values on a logarithmic scale.

Enrichment analyses further identified the gene categories, regulons, and metabolisms that were significantly enriched in the three modules (Fig. 6B). Two gene categories of nonessential function were enriched in M10, the module correlated to the number of DEGs; in contrast, no gene category was enriched in M2 and M16 (Fig. 6B, left). Regulons were enriched in all three modules without overlaps (Fig. 6B, middle). This indicated that the main regulatory mechanisms participating in the three modules for growth control, transcriptional change, and mutagenesis were divergent. GO enrichment identified various biological processes in the three modules, roughly related to transport, transposition, and translation in M2, M10, and M16, respectively (Fig. 6B, right). Compared to the enriched functions of the genes that were lost due to genome reduction (Fig. 6B, bottom), the gene categories of phage and unknown function and the biological processes related to DNA transposition and integration were also identified in M10. This strongly indicated that M10 was related to genome reduction. The newly constructed gene networks identified three modules correlated to mutation, DEGs, and growth, revealing the functional differentiation responsible for evolution to maintain the homeostatic transcriptome architecture for a growing cell.

Discussion

The evolutionary compensation for genome reduction was directed toward an identical goal of increased fitness but differed in the manner of genomic changes. Firstly, various genetic functions seemed to trigger the increased growth rates of the Evos. A few mutations could compensate for the sizeable genomic deficiency. The considerable variation in the fixed mutations, without overlaps among the nine Evos (Table 1), implied no common mutagenetic strategy for the evolutionary improvement of growth fitness. This was supported by the absence of genomic locational bias in the mutations fixed during evolution (Fig. 2). Secondly, the transcriptomes presented conserved chromosomal architectures (Fig. 3) and universal directional changes (Fig. S4), regardless of the significant and differentiated changes in gene expression in response to evolution (Fig. 4, Fig. S5). Although the periodicity of chromosomal architecture was well known ^{39, 40} and coordinated to the bacterial growth rate ^{28, 41}, viewing its homeostasis as an evolutionary consequence provided a new conceptual understanding of growing cells.

It is unclear whether the differentiation in evolutionary paths was particular to the reduced genome used in the present study. Evolution studies often focus on finding the common mutations accumulated in multiple evolutionary lineages to obtain a reasonable mechanism responsible for the adaptation to the defined condition (Fig. 7, i). Common mutations ^{22, 42} or identical genetic functions ⁴³ were reported in experimental evolution with different reduced genomes. Nevertheless, divergent evolutionary mechanisms were proposed, as not all mutations contribute to the evolved fitness ^{22, 43}. The present study accentuated the variety of mutations fixed during evolution. Considering the high essentiality of the mutated genes (Table S3), most or all mutations were assumed to benefit the fitness increase, as partially demonstrated previously ²⁰. The differentiated mutations and DEGs led to a homeostatic (Fig. 7, iii) rather than a variable consequence (Fig. 7, ii). The multiple evolutionary paths by which the reduced genome improved its growth fitness were like all roads leading to Rome.

Schematic drawing of evolutionary approaches for the reduced genome.

In addition, the transcriptome reorganization for fitness increase triggered by evolution differed from that for fitness decrease caused by genome reduction. General analyses failed to detect a regulatory network or genetic function mediated in common by genome reduction and evolution. Instead, the newly constructed gene modules were enriched in the gene categories of mobile elements and unknown functions (Fig. 6B, left), reflecting the evolutionary compensation for genome reduction. The represented mobile elements, flagella, were known to be responsive to environmental stresses such as hypoosmotic pressure or pH ^{44, 45}. Genome reduction and evolution seemed equivalent to the stress response in E. coli. These findings were reasonable, as enterobactin protected E. coli from oxidative stress ^{46, 47}, and enterobactin biosynthesis was upregulated for biofilm formation in genome-reduced E. coli ⁴⁸. The evolutionary compensation for genome reduction not only verified known functions and mechanisms from a global regulatory viewpoint but also offered a novel understanding of the molecular mechanisms and gene functions.

The differentiated functions of the gene modules might play a crucial role in response to genomic and evolutionary changes. WGCNA was conducted to discover the potential correlation of gene expression to growth fitness. It succeeded in finding the genes participating in the evolutionary changes of the reduced genome to regain growth fitness. Three enriched gene modules were assumed to be separately responsible for replication, transcription, and population dynamics (Fig. 6B). The growth-correlated gene module was significantly enriched in iron-related biological functions (M2). Although translation was commonly reported to be correlated to the growth rate ^{49, 50}, it was enriched in the gene module coordinated to transcriptional changes (M10). Such functional differentiation of the gene modules might be connected with the differentiated medium components responsible for varied bacterial growth phases, which was observed using a high-throughput growth assay in hundreds of medium combinations combined with machine learning ^{51, 52}. We assumed that the various chemicals disturbed different metabolic fluxes in which different gene modules might have participated. The biological relevance of the gene modules suggested an alternative genetic classification besides the commonly used clustering criteria, such as Gene Orthology ⁵³ and Regulon ⁵⁴. In summary, the present study provided a representative example showing that multiple evolutionary paths (i.e., gene mutation and expression) directed the reduced genome to improved fitness with a homeostatic transcriptome.

Materials and Methods

E. coli strains and culture medium

The E. coli K-12 W3110 wild-type and its genome-reduced strains were initially distributed by the National BioResource Project of the National Institute of Genetics. The reduced genome was approximately 21% smaller than the wild-type genome. The minimal medium M63 was used as described in detail elsewhere ^{13, 55}.

Experimental evolution

The genome-reduced E. coli strain was evolved in 2 mL of the M63 medium by serial transfer, as previously described ^{17, 20}. Nine evolutionary lineages were all initiated from an identical culture stock prepared in advance. The 24-well microplates specific for microbe culture (IWAKI) were used, and four wells holding four tenfold serial dilutions, e.g., 10³∼10⁶, were used for each lineage. The microplates were incubated in a microplate bioshaker (Deep Well Maximizer, Taitec) at 37°C, with rotation at 500 rpm. The serial transfer was performed at ∼24-h intervals. Only one of the four wells (dilutions), the one showing growth in the early exponential phase (OD₆₀₀ = 0.01-0.1), was selected and diluted into four wells of a new microplate using the four dilution ratios. Serial transfer was repeated until all evolutionary lineages reached approximately 1,000 generations, which required approximately two months per lineage. The daily records of the nine lineages are summarized in Table S1. The evolutionary generations (G) and the growth rates (μ) were calculated according to the following equations (Eq. 1 and Eq. 2).

Fitness assay

Both the ancestor and the evolved E. coli populations of the reduced genome were subjected to the fitness assay, as previously described ^{13, 55}. In brief, 200 µL of culture was dispensed into each well of a 96-well microplate (Costar). The microplate was incubated at 37°C in a plate reader (EPOCH2, BioTek), shaking at 567 cpm (cycles per minute) for 48 h. The temporal changes in OD₆₀₀ were measured at 30-min intervals. The growth fitness (r) was calculated using the following equation between any two consecutive points (Eq. 3).

Here, t_i and t_{i+1} are the culture times at the two consecutive measurement points, and C_i and C_{i+1} are the OD₆₀₀ at time points t_i and t_{i+1}. The growth rate was determined as the average of three consecutive r_i, showing the largest mean and minor variance. The mean of the biological triplicates was defined as the growth fitness and used in the analyses.
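The calculation described above can be sketched as follows. Because Eq. 3 itself is not reproduced in this text, the sketch assumes the standard exponential-growth form r_i = ln(C_{i+1}/C_i) / (t_{i+1} - t_i), which matches the variable definitions given; the function names and the synthetic readings are hypothetical, and the "minor variance" criterion is not implemented.

```python
import math
from statistics import mean

def pairwise_rates(times_h, od600):
    """Rate between consecutive readings: ln(C_{i+1} / C_i) / (t_{i+1} - t_i) (assumed form of Eq. 3)."""
    return [math.log(od600[i + 1] / od600[i]) / (times_h[i + 1] - times_h[i])
            for i in range(len(od600) - 1)]

def max_growth_rate(times_h, od600, window=3):
    """Largest mean over `window` consecutive pairwise rates."""
    rates = pairwise_rates(times_h, od600)
    return max(mean(rates[i:i + window]) for i in range(len(rates) - window + 1))

# Synthetic readings at 30-min intervals (not measured data):
# a short lag, exponential growth at 0.5 per hour, then a plateau.
times = [0.5 * i for i in range(8)]
od = ([0.010, 0.010]
      + [0.010 * math.exp(0.5 * 0.5 * k) for k in range(1, 5)]
      + [0.0272, 0.0272])
print(round(max_growth_rate(times, od), 3))
```

Averaging the values returned for the biological replicates would then give the growth fitness used in the analyses.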

Genome resequencing and mutation analysis

The E. coli cells were collected at the stationary growth phase (i.e., OD₆₀₀ > 1.0) and subjected to genome resequencing, as described previously ²⁰. In brief, the bacterial culture was stopped by adding rifampicin at 300 µg/mL. The cell pellet was collected for genomic DNA purification using a Wizard Genomic DNA Purification Kit (Promega) according to the manufacturer’s instructions. The sequencing library was prepared using the Nextera XT DNA Sample Prep Kit (Illumina), and the paired-end sequencing (300 bp × 2) was performed using the Illumina HiSeq platform. The raw datasets were deposited in the DDBJ Sequence Read Archive under the accession number DRA013662. The sequencing reads were aligned to the reference genome E. coli W3110 (AP009048.1, GenBank), and the mutation analysis was performed with the Breseq pipeline ^{56, 57}. The DNA sequencing and mapping statistics are summarized in Table S2. The mutations fixed in the coding regions due to evolution are shown in Table S3.

Calculation and simulation of the genomic positions of mutations

The distances from the genomic positions of the genome mutations fixed in the nine Evos to the nearest genomic scars caused by the genome reduction were calculated as described previously ^{13, 36}. Random mutations occurring on the reduced genome were simulated 1,000 times, and the distances of these mutations to the nearest genomic scars were calculated. Note that the number of mutations in each simulation was equal to that detected in the Evos. Welch’s t-tests were performed to evaluate the statistical significance (p values) of the bias in the distances observed in the Evos relative to the simulations. The distribution and the mean of the 1,000 p values acquired from the simulations were calculated to evaluate the locational bias of the mutations fixed in the Evos.
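The simulation described above can be sketched in a few functions. This is a minimal illustration, not the authors' code: it assumes a circular chromosome with integer coordinates and uniformly random mutation positions, all function names are hypothetical, and it returns Welch's t statistic with the Welch-Satterthwaite degrees of freedom rather than p values, which would additionally require a t-distribution (for example via SciPy's `scipy.stats.ttest_ind(a, b, equal_var=False)`).

```python
import bisect
import random
from statistics import mean, variance

def nearest_scar_distance(pos, scars, genome_len):
    """Shortest circular distance from pos to any scar; `scars` must be sorted."""
    i = bisect.bisect_left(scars, pos)
    right = scars[i % len(scars)]  # wraps to the first scar past the end
    left = scars[i - 1]            # index -1 wraps to the last scar
    return min((right - pos) % genome_len, (pos - left) % genome_len)

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def simulate(observed_positions, scars, genome_len, n_sim=1000, seed=0):
    """Compare observed distances against n_sim sets of uniformly random positions."""
    rng = random.Random(seed)
    scars = sorted(scars)
    observed = [nearest_scar_distance(p, scars, genome_len) for p in observed_positions]
    results = []
    for _ in range(n_sim):
        simulated = [nearest_scar_distance(rng.randrange(genome_len), scars, genome_len)
                     for _ in observed_positions]  # same number of mutations as observed
        results.append(welch_t(observed, simulated))
    return results

# Toy coordinates (not the real scars or mutations).
toy_results = simulate([120, 4_300, 9_900, 15_250], [1_000, 8_000, 16_000], 20_000, n_sim=5)
print(len(toy_results))
```

Converting each (t, df) pair to a p value and averaging over the simulations would give a quantity comparable to μ_p.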

RNA sequencing

The E. coli cells were collected at the exponential growth phase (i.e., 5 × 10⁷ to 2 × 10⁸ cells/mL) and subjected to RNA sequencing, as described previously ^{27, 41}. In brief, the bacterial growth was stopped by mixing the culture with an ice-cold 10% phenol ethanol solution. The cell pellet was collected to purify the total RNA using the RNeasy Mini Kit (QIAGEN) and RNase-Free DNase Set (QIAGEN) according to the product instructions. The rRNAs were removed from the total RNA using the Ribo-Zero Plus rRNA Depletion Kit (Illumina), and the mRNA libraries were prepared using the NEBNext Ultra Directional RNA Library Prep Kit for Illumina (New England Biolabs). The paired-end sequencing (150 bp × 2) was performed using the NovaSeq 6000 next-generation sequencer (Illumina). Biological replicates were performed for all conditions (N = 2–4). The raw datasets were deposited in the DDBJ Sequence Read Archive under the accession number DRA013662.

Data processing and normalization

The FASTQ files were mapped to the reference genome W3110 (accession number AP009048.1, GenBank) using the mapping software Bowtie 2 ⁵⁸, as described previously ^{27, 41}. The obtained read counts were converted to FPKM values according to the gene length and the total read count. Global normalization of the FPKM values was performed so that all datasets reached an identical mean value on the logarithmic scale. The gene expression level was determined as the logarithmic value of FPKM, and the mean values of the biological replicates were used for the following analyses (Table S6).
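The conversion and normalization steps can be illustrated with the short Python sketch below. It uses the conventional FPKM definition (reads × 10⁹ / (gene length in bp × total mapped reads)). The choice of log₁₀, the exclusion of genes with zero reads, and the normalization by a constant shift of each dataset are assumptions made for the example, as are all gene names and counts.

```python
import math
from statistics import mean

def fpkm(counts, lengths_bp):
    """Conventional FPKM: reads * 1e9 / (gene length in bp * total mapped reads)."""
    total = sum(counts.values())
    return {g: c * 1e9 / (lengths_bp[g] * total) for g, c in counts.items()}

def log_expression(counts, lengths_bp):
    """log10(FPKM); genes with zero reads are dropped (an assumption)."""
    return {g: math.log10(v) for g, v in fpkm(counts, lengths_bp).items() if v > 0}

def global_normalize(datasets):
    """Shift each log-scale dataset so that all datasets share the same mean."""
    grand = mean(mean(d.values()) for d in datasets)
    return [{g: v - mean(d.values()) + grand for g, v in d.items()} for d in datasets]

# Hypothetical gene lengths (bp) and read counts for two samples.
lengths = {"geneA": 1000, "geneB": 2000, "geneC": 500}
sample1 = {"geneA": 100, "geneB": 400, "geneC": 50}
sample2 = {"geneA": 300, "geneB": 900, "geneC": 60}

normalized = global_normalize([log_expression(s, lengths) for s in (sample1, sample2)])
for d in normalized:
    print({g: round(v, 3) for g, v in d.items()}, "mean =", round(mean(d.values()), 3))
```

A constant shift on the logarithmic scale corresponds to multiplying all FPKM values of a dataset by one factor, so the relative expression levels within each dataset are unchanged.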

Computational analysis

The normalized datasets were subjected to computational analyses performed with the R statistical software. A total of 3,290 genes were used for hierarchical clustering and principal component analysis (PCA), which were performed with the R functions "hclust" and "prcomp", respectively, as described previously²⁷. The "method" parameter of "hclust" was set to "average", and the "scale" parameter of "prcomp" was set to "F". The R package DESeq2⁵⁹ was used to determine the differentially expressed genes (DEGs) based on the false discovery rate (FDR < 0.05) ⁶⁰. The read counts were used as the input data for DESeq2, in which the data normalization was performed at each run of the pairwise comparison.

Enrichment analysis

Functional enrichment analyses were performed according to the features of gene category⁶¹, transcriptional regulation⁵⁴, gene ontology⁵³, and metabolic pathways^{62, 63}. Twenty-one gene categories comprising more than 15 genes and 46 regulons comprising more than 15 regulatees were subjected to the enrichment analysis. The statistical significance was evaluated by the binomial test with Bonferroni correction. The enrichment analysis of gene ontology (GO terms) ^{53, 64} and metabolic pathways (Kyoto Encyclopedia of Genes and Genomes, KEGG) ^{62, 63} was performed using DAVID, a web-based tool for visualizing the characteristics of gene clusters with expression variation ^{65, 66}. The statistical significance was evaluated according to the FDR.
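The binomial test with Bonferroni correction can be sketched as below. The example assumes a one-sided (over-representation) test in which the null probability is the fraction of all analyzed genes belonging to the category. The sidedness of the test, the category names, and all counts are hypothetical choices for illustration.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def enrichment(hits, sizes, n_selected, n_total):
    """Bonferroni-corrected one-sided binomial p value for each category."""
    n_tests = len(sizes)
    corrected = {}
    for category, size in sizes.items():
        p_null = size / n_total  # chance that a random gene falls in the category
        p_value = binomial_upper_tail(hits.get(category, 0), n_selected, p_null)
        corrected[category] = min(1.0, p_value * n_tests)
    return corrected

# Hypothetical numbers: 100 DEGs drawn from 3,290 genes, two categories tested.
sizes = {"motility": 50, "transport": 300}  # genes per category
hits = {"motility": 10, "transport": 9}     # DEGs per category
results = enrichment(hits, sizes, n_selected=100, n_total=3290)

for category, p_corrected in results.items():
    print(f"{category}: corrected p = {p_corrected:.2e}")
```

Multiplying by the number of tested categories is the simplest form of the Bonferroni correction; with 21 categories or 46 regulons, the multiplier would be 21 or 46, respectively.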

Chromosomal periodicity analysis

Fourier transform was used to evaluate the chromosomal periodicity of the transcriptome, as previously described ^{28, 41}. The genome was divided into compartments of 1 kb each, and the mean expression level of the genes within each compartment was calculated. Gene expression levels were smoothed with a moving average of 100 kb and subjected to the periodicity analysis using the function "periodogram" in R. The maximum peak (periodic wavelength) of the periodogram was fitted to the gene expression data using the function "nls" in R by the least squares method according to the following equation (Eq. 4).

Here, a, b, T, and c represent the periodic amplitude, the periodic phase, the periodic wavelength indicated by the maximum peak, and the mean expression level of the whole transcriptome as a constant, respectively. The statistical significance of the periodicity was assessed with Fisher’s g test (Table S4), as described previously ^{28, 41}. The genomic position of ori was taken from previous reports ^{41, 67}. In addition, the function "abline" in R was used to mark the genomic positions of the mutations.
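The smoothing and peak-finding steps can be illustrated with a small Python sketch. The study used the R functions "periodogram" and "nls"; the code below swaps in a direct discrete Fourier transform and omits the nonlinear fit. The synthetic signal assumes a sinusoid of the form a·sin(2πx/T + b) + c, which is consistent with the parameter definitions above but is an assumption here, and the bin count, window size, and noise level are hypothetical.

```python
import cmath
import math
import random

def moving_average_circular(values, window):
    """Centered moving average that wraps around the ends (circular chromosome)."""
    n, half = len(values), window // 2
    return [
        sum(values[(i + j) % n] for j in range(-half, half + 1)) / (2 * half + 1)
        for i in range(n)
    ]

def periodogram(values):
    """Power at each whole-number frequency k (cycles per genome), k = 1..n//2."""
    n = len(values)
    mu = sum(values) / n
    power = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(
            (v - mu) * cmath.exp(-2j * math.pi * k * i / n)
            for i, v in enumerate(values)
        )
        power[k] = abs(coeff) ** 2 / n
    return power

# Synthetic binned expression: six cycles per genome plus noise (all values hypothetical).
rng = random.Random(1)
n_bins, true_cycles = 600, 6
binned = [
    8.0 + 0.5 * math.sin(2 * math.pi * true_cycles * i / n_bins + 0.3) + rng.gauss(0, 0.2)
    for i in range(n_bins)
]

smoothed = moving_average_circular(binned, window=15)
power = periodogram(smoothed)
peak_k = max(power, key=power.get)  # frequency with the largest power
wavelength_bins = n_bins / peak_k   # periodic wavelength in units of bins
print(f"peak at {peak_k} cycles per genome, wavelength = {wavelength_bins:.0f} bins")
```

The wavelength found at the peak would then serve as the fixed T when fitting the amplitude, phase, and constant to the smoothed data.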

Gene network analysis

The weighted gene co-expression network analysis (WGCNA) of the nine Evos was performed with the R package WGCNA ³⁸, as described previously ²⁷. The step-by-step method was used to determine the parameters for constructing the gene networks. The soft threshold was set at 12, at which the R² of the scale-free topology model fit was approximately 0.9, as recommended in the developers’ instructions. The resultant gene networks were clustered with the "hclust" function (method = "average") and reconstructed by merging similar modules in the "dynamic tree cut" method using the "mergeCloseModules" function with a height cut of 0.25. Finally, a total of 21 modules were determined for 3,290 genes. The correlation coefficients and p-values between the expression of the gene modules and the other global features (growth rates, DEGs, and mutations) were evaluated using the functions "cor" and "corPvalueStudent" in R, respectively. FDR correction was applied to the p-values to account for multiple testing. Functional enrichment of the gene modules was performed, and the statistical significance was evaluated by the binomial test with Bonferroni correction as described above.
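To illustrate what the soft threshold does, the following Python sketch computes a soft-thresholded adjacency between genes from their expression across nine samples. It assumes an unsigned network, in which the adjacency is the absolute Pearson correlation raised to the power of 12; the network type is not stated above, so this is an assumption, and the gene names and expression values are hypothetical. The module detection and merging steps of the WGCNA package are not reproduced.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def soft_threshold_adjacency(expr, power=12):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** power for every gene pair."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** power
        for g in genes for h in genes if g != h
    }

# Hypothetical expression levels of three genes across nine samples.
expr = {
    "geneA": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "geneB": [3, 5, 7, 9, 11, 13, 15, 17, 19],  # perfectly correlated with geneA
    "geneC": [5, 3, 6, 2, 7, 4, 8, 1, 9],       # weakly correlated with geneA
}
adjacency = soft_threshold_adjacency(expr)
for pair, a in sorted(adjacency.items()):
    print(pair, f"{a:.2e}")
```

Raising the correlation to a high power keeps strong co-expression close to 1 and suppresses weak correlations toward 0, which is what drives the network toward a scale-free topology.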

Acknowledgements

We thank NBRP for providing the E. coli strains carrying the wild-type and reduced genomes (W3110 and KHK collection). This work was supported by the JSPS KAKENHI Grant-in-Aid for Scientific Research (B) (grant number 19H03215) and partially by Grant-in-Aid for Challenging Exploratory Research (grant number 21K19815).

Competing interests

The authors declare that there are no competing interests.

References

1. Koonin EV (2009) Evolution of genome architecture. Int J Biochem Cell Biol 41:298–306.
2. Lynch M (2006) Streamlining and simplification of microbial genome architecture. Annu Rev Microbiol 60:327–349.
3. Lynch M, Conery JS (2003) The origins of genome complexity. Science 302:1401–1404.
4. Kotaka Y, Hashimoto M, Lee KI, Kato JI (2023) Mutations identified in engineered Escherichia coli with a reduced genome. Front Microbiol 14.
5. Kato J, Hashimoto M (2007) Construction of consecutive deletions of the Escherichia coli chromosome. Mol Syst Biol 3.
6. Pósfai G, et al. (2006) Emergent properties of reduced-genome Escherichia coli. Science 312:1044–1046.
7. Hashimoto M, et al. (2005) Cell size and nucleoid organization of engineered Escherichia coli cells with a reduced genome. Mol Microbiol 55:137–149.
8. Aida H, Ying B-W (2023) Efforts to Minimise the Bacterial Genome as a Free-Living Growing System. Biology 12:1170.
9. Breuer M, et al. (2019) Essential metabolism for a minimal cell. Elife 8.
10. Hutchison CA 3rd, et al. (2016) Design and synthesis of a minimal bacterial genome. Science 351.
11. Hitomi K, Weng J, Ying BW (2022) Contribution of the genomic and nutritional differentiation to the spatial distribution of bacterial colonies. Front Microbiol 13.
12. Xue H, Kurokawa M, Ying BW (2021) Correlation between the spatial distribution and colony size was common for monogenetic bacteria in laboratory conditions. BMC Microbiol 21.
13. Kurokawa M, Seno S, Matsuda H, Ying BW (2016) Correlation between genome reduction and bacterial growth. DNA Res 23:517–525.
14. Karcagi I, et al. (2016) Indispensability of Horizontally Transferred Genes and Its Impact on Bacterial Genome Streamlining. Mol Biol Evol 33:1257–1269.
15. Gibson DG, et al. (2010) Creation of a bacterial cell controlled by a chemically synthesized genome. Science 329:52–56.
16. Orr HA (2009) Fitness and its role in evolutionary genetics. Nat Rev Genet 10:531–539.
17. Nishimura I, Kurokawa M, Liu L, Ying BW (2017) Coordinated Changes in Mutation and Growth Rates Induced by Genome Reduction. mBio 8.
18. Csorgo B, Feher T, Timar E, Blattner FR, Posfai G (2012) Low-mutation-rate, reduced-genome Escherichia coli: an improved host for faithful maintenance of engineered genetic constructs. Microbial cell factories 11.
19. Umenhoffer K, et al. (2010) Reduced evolvability of Escherichia coli MDS42, an IS-less cellular chassis for molecular and synthetic biology applications. Microbial cell factories 9.
20. Kurokawa M, Nishimura I, Ying BW (2022) Experimental Evolution Expands the Breadth of Adaptation to an Environmental Gradient Correlated With Genome Reduction. Front Microbiol 13.
21. Choe D, et al. (2019) Adaptive laboratory evolution of a genome-reduced Escherichia coli. Nat Commun 10.
22. Suzuki S, Horinouchi T, Furusawa C (2014) Prediction of antibiotic resistance by gene expression profiles. Nat Commun 5:5792.
23. Moger-Reischer RZ, et al. (2023) Evolution of a minimal cell. Nature 620:122–127.
24. Ishikawa A, et al. (2017) Different contributions of local- and distant-regulatory changes to transcriptome divergence between stickleback ecotypes. Evolution 71:565–581.
25. Rozen DE, de Visser JA, Gerrish PJ (2002) Fitness effects of fixed beneficial mutations in microbial populations. Curr Biol 12:1040–1045.
26. Lynch M (2010) Evolution of the mutation rate. Trends Genet 26:345–352.
27. Matsui Y, Nagai M, Ying BW (2023) Growth rate-associated transcriptome reorganization in response to genomic, environmental, and evolutionary interruptions. Front Microbiol 14.
28. Nagai M, Kurokawa M, Ying BW (2020) The highly conserved chromosomal periodicity of transcriptomes and the correlation of its amplitude with the growth rate in Escherichia coli. DNA Res 27.
29. Ying BW, Yama K (2018) Gene Expression Order Attributed to Genome Reduction and the Steady Cellular State in Escherichia coli. Front Microbiol 9:2255.
30. Engen S, Saether BE (2017) r- and K-selection in fluctuating populations is determined by the evolutionary trade-off between two fitness measures: Growth rate and lifetime reproductive success. Evolution 71:167–173.
31. Luckinbill LS (1978) r and K Selection in Experimental Populations of Escherichia coli. Science 202:1201–1203.
32. Molenaar D, van Berlo R, de Ridder D, Teusink B (2009) Shifts in growth strategies reflect tradeoffs in cellular economics. Mol Syst Biol 5.
33. Wortel MT, Noor E, Ferris M, Bruggeman FJ, Liebermeister W (2018) Metabolic enzyme cost explains variable trade-offs between microbial growth rate and yield. PLoS Comput Biol 14:e1006010.
34. Jordan IK, Rogozin IB, Wolf YI, Koonin EV (2002) Essential genes are more evolutionarily conserved than are nonessential genes in bacteria. Genome research 12:962–968.
35. Zhang J (2022) Important genomic regions mutate less often than do other regions. Nature 602:38–39.
36. Ying BW, Seno S, Kaneko F, Matsuda H, Yomo T (2013) Multilevel comparative analysis of the contributions of genome reduction and heat shock to the Escherichia coli transcriptome. BMC Genomics 14.
37. Ying BW, Seno S, Matsuda H, Yomo T (2017) A simple comparison of the extrinsic noise in gene expression between native and foreign regulations in Escherichia coli. Biochem Biophys Res Commun 486:852–857.
38. Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9.
39. Mathelier A, Carbone A (2010) Chromosomal periodicity and positional networks of genes in Escherichia coli. Mol Syst Biol 6.
40. Krogh TJ, Moller-Jensen J, Kaleta C (2018) Impact of Chromosomal Architecture on the Function and Evolution of Bacterial Genomes. Front Microbiol 9:2019.
41. Liu L, Kurokawa M, Nagai M, Seno S, Ying BW (2020) Correlated chromosomal periodicities according to the growth rate and gene expression. Sci Rep 10.
42. Choe D, et al. (2019) Adaptive laboratory evolution of a genome-reduced Escherichia coli. Nature Communications 10.
43. Lu H, et al. (2022) Primordial mimicry induces morphological change in Escherichia coli. Commun Biol 5.
44. Ikeda T, et al. (2020) Hypoosmotic stress induces flagellar biosynthesis and swimming motility in Escherichia albertii. Commun Biol 3.
45. Maurer LM, Yohannes E, Bondurant SS, Radmacher M, Slonczewski JL (2005) pH regulates genes for flagellar motility, catabolism, and oxidative stress in Escherichia coli K-12. J Bacteriol 187:304–319.
46. Adler C, Corbalan NS, Peralta DR, Pomares MF, de Cristóbal RE, Vincent PA (2014) The alternative role of enterobactin as an oxidative stress protector allows Escherichia coli colony development. PLoS One 9:e84734.
47. Peralta DR, Adler C, Corbalán NS, Paz García EC, Pomares MF, Vincent PA (2016) Enterobactin as Part of the Oxidative Stress Response Repertoire. PLoS One 11:e0157799.
48. May T, Okabe S (2011) Enterobactin is required for biofilm development in reduced-genome Escherichia coli. Environ Microbiol 13:3149–3162.
49. Scott M, Hwa T (2022) Shaping bacterial gene expression by physiological and proteome allocation constraints. Nat Rev Microbiol.
50. Dai X, et al. (2016) Reduction of translating ribosomes enables Escherichia coli to maintain elongation rates during slow growth. Nat Microbiol 2.
51. Aida H, Hashizume T, Ashino K, Ying BW (2022) Machine learning-assisted discovery of growth decision elements by relating bacterial population dynamics to environmental diversity. Elife 11.
52. Ashino K, Sugano K, Amagasa T, Ying BW (2019) Predicting the decision making chemicals used for bacterial growth. Sci Rep 9:7251.
53. Ashburner M, et al. (2000) Gene Ontology: tool for the unification of biology. Nature Genetics 25:25–29.
54. Salgado H, et al. (2013) RegulonDB v8.0: omics data sets, evolutionary conservation, regulatory phrases, cross-validated gold standards and more. Nucleic Acids Res 41:D203–213.
55. Kurokawa M, Ying BW (2017) Precise, High-throughput Analysis of Bacterial Growth. J Vis Exp.
56. Deatherage DE, Traverse CC, Wolf LN, Barrick JE (2014) Detecting rare structural variation in evolving microbial populations from new sequence junctions using breseq. Front Genet 5.
57. Barrick JE, et al. (2014) Identifying structural variation in haploid microbial genomes from short-read resequencing data using breseq. BMC Genomics 15:1039.
58. Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359.
59. Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 15.
60. Storey JD (2002) A direct approach to false discovery rates. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64:479–498.
61. Riley M, et al. (2006) Escherichia coli K-12: a cooperatively developed annotation snapshot--2005. Nucleic Acids Res 34:1–9.
62. Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M (2016) KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res 44:D457–462.
63. Kanehisa M, Goto S (2000) KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28:27–30.
64. Carbon S, et al. (2009) AmiGO: online access to ontology and annotation data. Bioinformatics 25:288–289.
65. Huang DW, et al. (2007) DAVID Bioinformatics Resources: expanded annotation database and novel algorithms to better extract biology from large gene lists. Nucleic Acids Res 35:W169–175.
66. Huang da W, Sherman BT, Lempicki RA (2009) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 4:44–57.
67. Bryant JA, Sellars LE, Busby SJ, Lee DJ (2014) Chromosome position effects on gene expression in Escherichia coli K-12. Nucleic Acids Res 42:11383–11392.

Article and author information

Author information

Kenya Hitomi
School of Life and Environmental Sciences, University of Tsukuba, Tennodai 1-1-1, Tsukuba, Ibaraki 305-8572, Japan
Yoichiro Ishii
School of Life and Environmental Sciences, University of Tsukuba, Tennodai 1-1-1, Tsukuba, Ibaraki 305-8572, Japan
Bei-Wen Ying
School of Life and Environmental Sciences, University of Tsukuba, Tennodai 1-1-1, Tsukuba, Ibaraki 305-8572, Japan
ORCID iD: 0000-0003-2517-5686
- Correspondence: ying.beiwen.gf@u.tsukuba.ac.jp

Version history

Sent for peer review: November 8, 2023
Preprint posted: November 12, 2023
Reviewed Preprint version 1: January 23, 2024
Reviewed Preprint version 2: April 19, 2024
Version of Record published: May 1, 2024

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.93520. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
