Schematic overview of high-throughput quantification of in vitro pneumococcal growth kinetics and the genome-wide discovery of individual genomic variation influencing in vitro pneumococcal growth.

Population genomic approaches using genome-wide association study (GWAS) offer a robust way to identify genomic variation influencing the growth dynamics of the pneumococcus. First, we collected 348 pneumococcal blood culture isolates that were obtained from 348 patients admitted to the hospital with invasive pneumococcal disease. The isolates were inoculated in a liquid medium in a highly standardised manner followed by measurement of the growth kinetics after every ten minutes for slightly over 15 hours using a microplate reader13. We then fitted a logistic function (see methods) to measure growth features, namely, the average growth rate (r), maximum growth density (Hmax) defined as Lmax, lag phase duration, and maximum growth change (ΔH), defined as the difference between the maximum and minimum density (Lmax-Lmin). Additionally, genomic DNA was extracted from the cultured isolates for whole genome sequencing using the Illumina HiSeq 4000 sequencing platform. The generated sequencing data was used to reconstruct a maximum-likelihood phylogenetic tree of the isolates for downstream analyses, for example, the assessment of phylogenetic signals for the estimated growth features. We also performed de novo assembly of the sequencing reads to generate contigs, which were used to map against the ATCC 700669 pneumococcal reference genome (GenBank accession: NC_011900) to identify consensus single nucleotide polymorphisms (SNPs). We also generated presence, and absence patterns for unitigs or variable length k-mer sequences based on the draft assembled contigs. The SNPs were used to generate the phylogeny of the isolates while the unitigs were used in a GWAS to identify variants associated with the estimated pneumococcal growth features. The images were created with BioRender (https://www.biorender.com/).

© 2024, BioRender Inc. Any parts of this image created with BioRender are not made available under the same license as the Reviewed Preprint, and are © 2024, BioRender Inc.

Growth curves for selected serotypes show the in vitro pneumococcal growth dynamics and estimated growth features inferred by mathematical modelling.

The growth curves above show the optical density at 620 nm monitored over 15 hours for an example dataset of 20 pneumococcal isolates representing different serotypes and sequence types (STs) based on the pneumococcal MLST scheme. We used the nonlinear optimisation algorithm to identify a combination of parameters with the best fit to a logistic curve based on the lowest residual sum of squares for the predicted value based on the model relative to the observed growth data (see methods). We fitted separate curves based on a subset of points for each replicate up to the end of the stationary phase or end of the experiment, whichever came first. We did this to get a better fit for the logistic curve during the growth phase, as it does not allow for negative growth, which happened after the stationary phase. We fitted the curves to infer the growth features for each of the six replicates. The averaged estimates for each replicate were then used for the downstream analyses. Additional plots showing fitted curves for all the isolates are shown in Supplementary Fig. 1.

The in vitro pneumococcal growth features show variable patterns dependent on the capsular serotype and genetic background or lineage.

(a) Chord diagram showing the association between pneumococcal capsular serotypes and lineages for the Dutch isolates included in this study. The serotypes were determined using in silico approaches based on the sequencing data using SeroBA85, while the lineages were defined using the PopPUNK framework86 based on the global pneumococcal sequencing cluster (GPSC) lineage nomenclature34. The thickness of the connecting lines between serotypes and lineages in the chord diagram represents the percentage of isolates belonging to each serotype and lineage. Due to capsule switching, some serotypes were associated with multiple lineages, and similarly, some lineages were associated with multiple serotypes. The violin plot shows the distribution of the estimated average growth rate across different serotypes. The distribution of the estimated growth features varied across serotypes with at least three isolates, as shown by the violin plots for the (b) average growth rate, (c) maximum growth change, (d) maximum growth change across serotypes, (e) lag phase duration growth features. Similarly, the distribution of the estimated growth features varied across lineages with at least three isolates, as shown by the violin plots for the (f) average growth rate, (g) maximum growth change, (h) maximum growth density, and (i) lag phase duration growth features. A comparison of the growth features between two serotypes belonging to the same lineages with at least three isolates is shown for the (j) growth rate, (k) maximum growth, (l) maximum growth change, and (m) the lag phase duration features. A comparison of the growth features between two lineages expressing the same serotype containing at least three isolates are shown for the (n) growth rates, (o) maximum growth, (p) maximum growth change, and (q) the lag phase features. Additional information regarding the isolates and their estimated growth features or parameters are available in Supplementary Data 2.

Ancestral reconstruction reveals a strong correlation between the estimated in vitro pneumococcal growth features and the whole-genome phylogeny of the isolates.

The evolution of the normalised estimated continuous growth features, namely average growth rate, maximum growth change, maximum growth density, and lag phase duration, were assessed by estimating the states of the internal nodes of the maximum likelihood phylogenetic tree of the isolates using a maximum likelihood ancestral reconstruction approach92. The length of the scale bar shows the number of nucleotide substitutions per site, while the colours represent the lowest and highest unnormalised growth features. The length of the scale bar shows the number of nucleotide substitutions per site, while the colours represent the lowest and highest unnormalised growth features. A phylogenetic tree showing bootstrap support values on the internal nodes of the tree is provided in Supplementary Data 3.

The estimated in vitro pneumococcal growth features show high phylogenetic signals and narrow-sense heritability.

(a) Bar plot showing the phylogenetic signal of the growth features using Pagel’s λ statistic38. Pagel’s λ measures the correlated evolution of the traits with the phylogenetic tree of the isolates. Pagel’s λ estimates close to zero indicate no phylogenetic signal, i.e., the growth features were independent of the phylogeny, while values close to one show a strong phylogenetic signal. The average growth rate, maximum growth change, and maximum growth density showed the highest phylogenetic signal, while the lag phase duration feature showed an intermediate signal. In contrast, the maximum growth rate defined by Arends et al.13 had the lowest phylogenetic signal, i.e., evolved more independently of the phylogeny. (b) Bar plot showing the narrow-sense heritability (h2) of the pneumococcal growth features based on unitig sequences. The narrow-sense heritability corresponds to the amount of variability in the traits, i.e., growth features, explained by the pneumococcal genetic variation. The heritability is the proportion of phenotypic variation explained by the bacterial genetics (PVE) parameter using the linear mixed effects model implemented in GEMMA96. A heritability value close to zero implies low heritability or minimal contribution of genetics to the variability of the traits. In contrast, values close to one suggest a strong influence of genetics on growth features. There was a substantial impact of pneumococcal genetics on all the growth features, with the highest influence seen for the average growth rate, maximum growth change, and maximum growth density, with a slightly lower contribution for the lag phase duration and the maximum growth rate measured by Arends et al.13. All the error bars represent the 95% confidence intervals. Additional information is provided in Supplementary Data 3.

Pneumococcal genetic variation associated with the in vitro growth features or parameters of the isolates.

Manhattan plot showing the statistical significance of the hits from the GWAS analysis based on unitigs (a) without adjusting for the serotype and (b) after adjusting for the serotype. The statistical significance (P-value) were inferred based on the likelihood ratio test using Pyseer94 mapped against the ATCC 700669 pneumococcal reference genome (GenBank accession: NC_011900101) to identify their location in the pneumococcal genome. The dots in the Manhattan plots with different colours represent GWAS for different growth features, as shown in the key. The red dotted line designates the genome-wide threshold for considering a variant as significantly associated, statistically, with the growth features. The blue dotted line represents the threshold for classifying variants as suggestively associated with the growth features. (c) Quantile–quantile plots showing the relationship between the observed statistical significance and the expected statistical significance for the GWAS of growth features not adjusting for the serotype. (d) Quantile–quantile (QQ) plots showing the relationship between the observed statistical significance and the expected statistical significance for the GWAS of growth features after adjusting for the capsular serotype.

Association between average in vitro pneumococcal growth rate with the prevalence of serotypes in invasive disease, nasopharyngeal carriage, and invasiveness.

(a) Scatter plot showing the relationship between the prevalence of serotypes among invasive disease isolates collected before the introduction of PCV7 in the Netherlands and the average in vitro growth rate (r) of the isolates inferred as shown in Fig. 1. A Spearman correlation test showed no evidence of an association between the growth rate and prevalence of serotypes among the invasive disease isolates. (b) Scatter plot showing the relationship between the inferred prevalence of serotypes in the nasopharyngeal carriage before the introduction of PCV7 in the Netherlands and the inferred average in vitro growth rate. Due to the unavailability of the nasopharyngeal carriage data, we imputed the prevalence of the serotypes in carriage by dividing the prevalence in invasive disease by the average invasiveness of each serotype. The invasiveness index was calculated based on the meta-analysis of data collected before PCV7 introduction across multiple countries (see methods). A Spearman correlation test of the growth rate and prevalence of serotypes in carriage suggested a negative association between the variables. The association remained even after excluding serotype 3, which seemed like an outlier. (c) The relationship between the invasiveness and the average growth rates of each serotype from the pre-vaccination data. A Spearman correlation test of the growth rates and invasiveness suggested no association between the variables. All the error bars represent the 95% confidence intervals. Additional information is provided in Supplementary Data 4.