The diversity and function of sourdough starter microbiomes

  1. Elizabeth A Landis
  2. Angela M Oliverio
  3. Erin A McKenney
  4. Lauren M Nichols
  5. Nicole Kfoury
  6. Megan Biango-Daniels
  7. Leonora K Shell
  8. Anne A Madden
  9. Lori Shapiro
  10. Shravya Sakunala
  11. Kinsey Drake
  12. Albert Robbat
  13. Matthew Booker
  14. Robert R Dunn
  15. Noah Fierer
  16. Benjamin E Wolfe (corresponding author)
  1. Department of Biology, Tufts University, United States
  2. Department of Ecology and Evolutionary Biology, University of Colorado, United States
  3. Cooperative Institute for Research in Environmental Sciences, University of Colorado, United States
  4. Department of Applied Ecology, North Carolina State University, United States
  5. North Carolina Museum of Natural Sciences, United States
  6. Department of Chemistry, Tufts University, United States
  7. Department of History, North Carolina State University, United States
  8. Danish Natural History Museum, University of Copenhagen, Denmark
4 figures, 1 video, 1 table and 1 additional file


Figure 1
The distribution of sourdough starters sampled in this study.

(A) Overview of the process of serial transfer of a sourdough starter. (B) Locations of the 500 sourdough starters analyzed in this study. Each dot represents one sourdough starter. (C-G) Characteristics of collected sourdough starters. In (D), RT = room temperature. In (G), ‘Individual’ = participant reported acquiring their starter from another individual (not a commercial source); ‘Business’ = participant reported acquiring their starter from a commercial source.

Figure 2 with 4 supplements
Process parameters and geography weakly predict the diversity of sourdough starters.

(A) Starters (n = 500) hierarchically clustered by Bray-Curtis dissimilarities. The stacked bar chart on the left shows the proportion of total reads across all samples belonging to the orders Rhodospirillales (AAB), Lactobacillales (LAB), and Saccharomycetales (yeast) (see Figure 2—source data 1, 2 for a complete list of these taxa). On the right, each column represents an individual sourdough starter. See Figure 2—source data 3 for co-occurrence analysis results. Below the bar chart, + indicates samples selected for functional analysis (Figure 4). Continental U.S. geographic regions were clustered at two scales: k = 4 (B) and k = 15 (C). Dots represent individual samples. Each geographic cluster is encircled. Colored dots represent clusters where indicator taxa were significantly (p<0.05) associated with geographic clusters according to indicator species analysis. In (D) and (E), indicator strengths (Figure 2—source data 6) illustrate individual ASVs that are significantly associated with (D) process parameters, including starter maintenance techniques, and (E) climatic parameters. Each dot or triangle represents an individual bacterial or fungal ASV, respectively.

Figure 2—source data 1

The most abundant bacterial and fungal taxa across the 500 sourdough starter samples that are not typically considered an active part of starter communities (i.e. excluding yeasts, lactic acid bacteria, and acetic acid bacteria).
Figure 2—source data 2

The most abundant yeast, lactic acid bacteria, and acetic acid bacteria species across the 500 sourdough starter samples.
Figure 2—source data 3

Co-occurrence statistics of sourdough yeasts and bacteria calculated with the R package ‘cooccur’.
Figure 2—source data 4

Predictors (n=33) included in PERMANOVA tests on bacterial and fungal dissimilarities.
Figure 2—source data 5

Abiotic properties are poor predictors of overall variation in both bacterial and fungal community composition across sourdough starters.
Figure 2—source data 6

Complete list of indicator taxa and summary statistics, as described in Figure 2.
Figure 2—figure supplement 1
Phylogenetic trees of (A) lactic acid bacteria (LAB) and (B) acetic acid bacteria (AAB) detected in the 500 sourdough starters.

Also included in trees are LAB isolate strains (n = 4) used in pairwise competition experiments and reference strains of LAB and AAB from RDP. Shading indicates unique clades at ≥97% patristic similarity.

Figure 2—figure supplement 2
Richness across starter microbial communities.

For each starter (n = 500) the total number of ASVs for (A) bacteria including both LAB and AAB, (B) yeast, and (C) yeast, LAB, and AAB combined. For A-C, the dashed red line denotes the median richness. For each starter sample, (D) shows the number of LAB and AAB versus yeast detected. We did not detect a correlation between LAB/AAB and yeast richness across starters (Spearman’s rho = 0.04, p>0.05).

Figure 2—figure supplement 3
A co-occurrence analysis showing all significant associations.

Isolates used in synthetic pairwise interaction experiments are in the inner circle, and relative abundance (within yeast and within bacteria) is indicated by the splined size of circles. The Bonferroni-corrected significance of these associations is indicated by the edge (line) thickness. The thickest lines represent associations where p<0.001, lines of medium thickness indicate p<0.01, and the thinnest lines represent p<0.05. The species are organized by kingdom, with yeast on the top part of the figure and bacteria on the bottom. Of the 16 significant associations we detected, 14 were within-kingdom and two were cross-kingdom. All associations were calculated with the R package ‘cooccur’.

Figure 2—figure supplement 4
Geographic location is a weak predictor of fungal sourdough starter community and not a significant predictor of bacteria.

Pairwise comparisons of community dissimilarity (Bray-Curtis) and geographic distances with Mantel tests (Spearman rank correlations with 999 permutations). For each comparison, we compared both the whole dataset (n = 500) and the continental US only (n = 424). (A) Fungal versus bacterial community dissimilarity across all samples (rM = 0.04, p<0.05) and (B) across the continental US (rM = 0.05, p<0.01). (C) Bacteria versus geographic distance with the whole dataset (rM = 0.01, p>0.05) and (D) US only (rM = 0.01, p>0.05). (E) Fungi versus geographic distance with all data (rM = 0.23, p≤0.001) and (F) US only (rM = 0.04, p≤0.001).
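The Mantel procedure named in this legend can be illustrated with a minimal sketch. It is not the authors' analysis code: it assumes a one-sided permutation test, uses only the Python standard library, and the toy communities, positions, and all function names are hypothetical. Rows and columns of one distance matrix are permuted together, and the Spearman correlation of the upper triangles is recomputed for each permutation.

```python
import random
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def ranks(values):
    """Average ranks (tied values share their mean rank)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; applied to ranks it gives Spearman's rho."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def upper(matrix, order):
    """Upper-triangle entries after reordering rows and columns together."""
    pairs = combinations(range(len(order)), 2)
    return [matrix[order[i]][order[j]] for i, j in pairs]


def mantel(d1, d2, permutations=999, seed=0):
    """Return (Spearman Mantel statistic, one-sided permutation p-value)."""
    ident = list(range(len(d1)))
    y = ranks(upper(d2, ident))
    observed = pearson(ranks(upper(d1, ident)), y)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = ident[:]
        rng.shuffle(perm)
        if pearson(ranks(upper(d1, perm)), y) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy data: five samples, three taxa, positions on a line.
communities = [[10, 0, 5], [8, 1, 6], [0, 9, 2], [1, 10, 1], [5, 5, 5]]
positions = [0.0, 1.0, 7.0, 8.0, 4.0]
community_d = [[bray_curtis(a, b) for b in communities] for a in communities]
geo_d = [[abs(a - b) for b in positions] for a in positions]
r_m, p_value = mantel(community_d, geo_d)
print(f"rM = {r_m:.2f}, p = {p_value:.3f}")
```

With 999 permutations the smallest attainable p-value is 0.001, which is consistent with the p≤0.001 notation used in the legend.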

Figure 3 with 1 supplement
Growth rate and competitiveness fail to explain abundance patterns, but co-occurrence patterns in situ are recovered in pairwise coexistence experiments.

(A) All possible 1:1 species pairs were grown in 200 µL liquid flour media (n = 5), and 10 µL was serially transferred every 48 hr. This conceptual schematic follows one pairing, K. humilis and L. sanfranciscensis, to illustrate the experimental approach. (B) Mean relative abundance of pairs at the end of transfer six. Pairs where both isolates persisted (>1% relative abundance) at the end of the experiment are outlined; error bars are ± SE. For all replicates at transfers one, three, and six, see Figure 3—figure supplement 1. (C) Growth of individual isolates alone (CFUs of each isolate after six transfers) was positively and significantly correlated with a simple persistence index (the number of competitions where the isolate persisted; detection limit of mean one percent abundance across replicates; Spearman’s ρ = 0.81, p=0.02). (D) The frequency of each taxon in the amplicon sequencing dataset was positively, but not significantly, associated with the number of competitions where that isolate persisted (Spearman’s ρ = 0.39, p=0.34). (E) Significant (Bonferroni-corrected p<0.05) patterns of co-occurrence between taxa in our amplicon sequencing (top) were replicated 7 of 8 times in our experimental manipulation (bottom). All pairwise experimental outcomes from transfer six are represented in the bottom part of the figure; the eight pairs that have significant co-occurrence associations are highlighted and the experimental outcomes that matched the co-occurrence data have an asterisk. Refer to Figure 2—figure supplement 3 and Figure 2—source data 3 for all amplicon co-occurrence data.
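The persistence index described in (C) can be sketched as a simple count. The snippet below is an illustration under stated assumptions, not the authors' code: the data layout (each pair mapped to the first isolate's relative abundance per replicate), the function name, and the example values are all hypothetical. Only the 1% mean-abundance detection limit comes from the legend.

```python
from collections import defaultdict

DETECTION_LIMIT = 0.01  # mean relative abundance across replicates (1%)


def persistence_index(outcomes, limit=DETECTION_LIMIT):
    """Count, per isolate, the competitions in which it persisted.

    `outcomes` maps (isolate_a, isolate_b) to a list with one value per
    replicate: the relative abundance of isolate_a (0 to 1) at the final
    transfer. The abundance of isolate_b is taken as the remainder.
    """
    index = defaultdict(int)
    for (a, b), fractions in outcomes.items():
        mean_a = sum(fractions) / len(fractions)
        index[a] += mean_a > limit          # True counts as 1
        index[b] += (1 - mean_a) > limit
    return dict(index)


# Hypothetical outcomes for three isolates (values are made up).
example = {
    ("A", "B"): [0.5, 0.6],     # both persist
    ("A", "C"): [1.0, 1.0],     # C excluded
    ("B", "C"): [0.005, 0.0],   # B falls below the detection limit
}
print(persistence_index(example))
```

An isolate that never persists still appears in the result with a count of zero, so the index can be correlated directly against growth in monoculture.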

Figure 3—source data 1

CFU counts and relative abundance data from competitions, transfers one, three, and six.
Figure 3—figure supplement 1
Pairwise competition experimental outcomes at transfers one, three, and six.

Pairs were inoculated at approximately equal densities and transferred every 48 hr. Percent relative abundance of all experimental pairs after each transfer is shown in each column. All replicates are represented. Pairs where both species persist (1% relative abundance detection limit) are outlined in black. Outcomes that were predicted by co-occurrence analysis in the amplicon sequencing (see Figure 2—figure supplement 3) have an asterisk.

Figure 4 with 3 supplements
Acetic acid bacteria are drivers of sourdough starter functional diversity.

Heatmap shows the relative abundances of VOCs (z-scores) across samples. Columns represent the 40 starter samples clustered with Bray-Curtis dissimilarities of VOC profiles, resulting in two main clusters. Rows show the top 48 VOCs clustered by correlation similarity. Numbered VOCs are unknown compounds. Top rows indicate the total percentage of AAB and the three measured functional outputs. Functional outputs were all predicted by % AAB including: (1) mean dough rise rate (ρ = −0.51, p<0.001), (2) the overall VOC composition represented by the first NMDS axis (see Figure 4—figure supplement 1; Mantel ρ = 0.73, p<0.001) and (3) the dominant sensory note (adj. R2 = 37%, p<0.01, see Figure 4—source data 2 for all sensory notes).
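The z-score scaling used for the heatmap can be shown with a short sketch. It assumes each VOC is standardized across samples using the sample standard deviation; the legend does not state which variant was used, and the compound names and values below are hypothetical.

```python
from statistics import mean, stdev


def zscore_rows(abundances):
    """Standardize each compound's abundances across samples.

    `abundances` maps a compound name to its relative abundance in each
    sample. Compounds with no variation are returned as all zeros.
    """
    scaled = {}
    for compound, values in abundances.items():
        mu = mean(values)
        sd = stdev(values)  # sample standard deviation (an assumption)
        scaled[compound] = [(v - mu) / sd if sd else 0.0 for v in values]
    return scaled


# Hypothetical relative abundances for two compounds in three samples.
vocs = {"compound_1": [1.0, 2.0, 3.0], "compound_2": [5.0, 5.0, 5.0]}
print(zscore_rows(vocs))
```

Scaling per compound puts abundant and rare VOCs on the same color scale, so the heatmap shows which samples are high or low for each compound rather than which compounds dominate overall.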

Figure 4—source data 1

The relationships between microbial taxa (lactic acid bacteria, acetic acid bacteria, and yeast) and functional outputs.
Figure 4—source data 2

Complete list of sensory panel notes.
Figure 4—source data 3

Dough rise data over the course of 36 hours of rise.
Figure 4—source data 4

Volatile organic compound profiles collected for a subset of 40 starters.
Figure 4—figure supplement 1
VOC data across replicate sourdough starters.

(A) The relationships between volatile organic compound (VOC) profiles represented by a non-metric multidimensional scaling ordination (NMDS). Shading and hulls indicate VOC community profiles from the same initial microbial inoculum. Starter inoculum explained most of the variation in VOC profiles (PERMANOVA R2 = 0.91 and p≤0.001). (B) A heatmap of all VOC community profiles (n = 118) and compounds detected (n = 123 at ≥0.0001 mean abundance). Colors indicate z-scores. Rows represent individual sample replicates, and columns represent VOCs. Both rows and columns are clustered hierarchically with Bray-Curtis dissimilarities.

Figure 4—figure supplement 2
Dough rise rates are predicted by starting microbial inoculum (adj. R2 = 0.42, p<0.001).

Boxplots of starter inoculum (the 40 samples selected for functional analyses) by rate of dough rise (n = 107 samples included after quality filtering).

Figure 4—figure supplement 3
The four most frequently reported sensory notes from the 40 samples analyzed by an expert sensory panel.

Only notes that were reported more than five times are included in the analysis. Percent acetic acid bacteria correlated with differences in sensory notes, with acetic acid/vinegar versus fermented sour and green apple showing the strongest differences (Dunn test p=0.04 and 0.06 respectively).


Video 1
Dough rise analysis using a common garden sourdough starter approach.

Video shows the first of three batches of sterilized flour and water (n = 40, three replicates of each) that were inoculated with sourdough starters. Dough rise was measured by tracking the tops of each dough using video tracking software over the course of 36 hrs. Changes from the starting position were fitted with logistic growth curves using the R package GrowthCurver. Videos for each tube were trimmed if they fell to more than 5% of their maximum rise value. Dough rise rates ranged from 0.1 to 1.5 mm/hr. For scale, tubes are 103 mm tall with their caps. Doughs were removed part way through for placement of volatile organic compound collection bars which were present during hours 12–36. Video also available at:


Key resources table
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
| Commercial assay or kit | Powersoil | Qiagen | Cat No./ID: 47014 | |
| Sequence-based reagent | 515 f | Caporaso et al., 2011 | PCR primer | Forward primer used for amplifying bacterial DNA for amplicon sequencing |
| Sequence-based reagent | 806 r | Caporaso et al., 2011 | PCR primer | Reverse primer used for amplifying bacterial DNA for amplicon sequencing |
| Sequence-based reagent | ITS1f | Gardes and Bruns, 1993 | PCR primer | Forward primer used for amplifying fungal DNA for amplicon sequencing and Sanger sequencing |
| Sequence-based reagent | ITS2 | White et al., 1990 | PCR primer | Reverse primer used for amplifying fungal DNA for amplicon sequencing |
| Sequence-based reagent | ITS4 | White et al., 1990 | PCR primer | Reverse primer used for amplifying fungal DNA for Sanger sequencing |
| Sequence-based reagent | 27 f | Lane, 1991 | PCR primer | Forward primer used for amplifying bacterial DNA for amplicon sequencing and Sanger sequencing |
| Sequence-based reagent | 1492 r | Turner et al., 1999 | PCR primer | Reverse primer used for amplifying bacterial DNA for amplicon sequencing and Sanger sequencing |
| Software, algorithm | Dada2 | Callahan et al., 2016 | | Software package for identifying amplicon sequence variants (ASVs) |
| Software, algorithm | raxml-HPC | Stamatakis, 2014 | | Phylogenetic tree builder for taxonomic assignments of ASVs |
| Software, algorithm | Kaiju | Menzel et al., 2016 | | Metagenomic taxonomy assignment software using unassembled reads |
| Database | Refseq | | | Used with Kaiju for bacterial species assignments of metagenomic reads |
| Software, algorithm | R | R Core Team, 2019 | RRID:SCR_001905 | Used for statistical analyses |
| Software, algorithm | Matlab-based DLTdv-5 | Hedrick, 2008 | | Used for video tracking of sourdough height for dough rise profiles |
| Other | Twister PDMS stir bar | Gerstel | | Collection medium for volatile organic compounds in functional assays |
| Other | Lactobacilli MRS agar | Criterion | C5930 | Growth medium for the cultivation of lactic acid bacteria |
| Other | CHROMagar Candida | CHROMagar | CA222 | Differential growth medium; creates differential pigmentation and growth phenotypes for distinguishing yeast |
| Strain | Lactobacillus sanfranciscensis 17B2 | This paper | MW218985 | |
| Strain | Lactobacillus brevis 0092a | This paper | MW218986 | |
| Strain | Lactobacillus paralimentarius 0316d | This paper | MW218987 | |
| Strain | Lactobacillus plantarum 232 | This paper | MW218988 | |
| Strain | Saccharomyces cerevisiae 253 | This paper | MW219042 | |
| Strain | Wickerhamomyces anomalus 163 | This paper | MW219039 | |
| Strain | Kazachstania humilis 228 | This paper | MW219040 | |
| Strain | Kazachstania servazzii 177 | This paper | MW219041 | |


Elizabeth A Landis, Angela M Oliverio, Erin A McKenney, Lauren M Nichols, Nicole Kfoury, Megan Biango-Daniels, Leonora K Shell, Anne A Madden, Lori Shapiro, Shravya Sakunala, Kinsey Drake, Albert Robbat, Matthew Booker, Robert R Dunn, Noah Fierer, Benjamin E Wolfe. The diversity and function of sourdough starter microbiomes. eLife 10:e61644.