Characterisation and comparison of semen microbiota and bacterial load in men with infertility, recurrent miscarriage, or proven fertility
Figures

Characterisation of semen microbiota composition at genera level.
(A) Heatmap of Log10 transformed read counts of top 10 most abundant genera identified in semen samples. Samples clustered into three major microbiota groups based mainly on dominance by Streptococcus (Cluster 1), Prevotella (Cluster 2), or Lactobacillus and Gardnerella (Cluster 3) (n=223, Ward’s linkage). (B) Relative abundance of the top 6 most abundant genera within each cluster. (C) Silhouette scores of individual samples within each cluster. (D) Species richness (p<0.0001; Kruskal-Wallis test) and (E) alpha-diversity (p<0.0001; Kruskal-Wallis test) significantly differed across clusters. (F) Assessment of bacterial load using qPCR showed Clusters 2 and 3 have significantly higher bacterial loads compared to Cluster 1. Dunn’s multiple comparison test was used as a post hoc test for between-group comparisons (*p<0.05, ****p<0.0001).

Genera-level categorisation of seminal microbiota identified three major clusters, with the number of clusters selected using average silhouette scores.
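As a rough illustration of how an average silhouette score can guide the choice of cluster number, the sketch below computes per-sample silhouette widths for a toy one-dimensional dataset. The data, the absolute-difference distance, and the function name `silhouette_widths` are illustrative assumptions and do not reproduce the study's clustering pipeline.

```python
from statistics import mean

def silhouette_widths(points, labels):
    """Return the silhouette width s(i) = (b - a) / max(a, b) for each point.

    a: mean distance to the other members of the point's own cluster.
    b: lowest mean distance to the members of any other cluster.
    Assumes at least two clusters; singleton clusters get a width of 0.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)
            continue
        a = mean(abs(points[i] - points[j]) for j in own)
        b = min(
            mean(abs(points[i] - points[j]) for j in members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom else 0.0)
    return widths

# Toy data (hypothetical): two tight groups on a line.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
good = silhouette_widths(points, [0, 0, 0, 1, 1, 1])  # labels follow the groups
poor = silhouette_widths(points, [0, 1, 0, 1, 0, 1])  # labels ignore the groups
print(round(mean(good), 3), round(mean(poor), 3))
```

In practice the same calculation would be repeated for each candidate number of clusters, and the number giving the highest average width would be preferred.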

Co-occurrence network estimated with SparCC from 16S sequencing counts at species level.
Network representing co-occurrence patterns (edges) between taxonomic units assigned at species level (nodes). Edges are coloured by their estimated SparCC correlation coefficient (ρ). Edges with a SparCC bootstrapped p-value>0.05 or ρ<0.25, and singleton nodes, are not shown. Node colour represents network community membership. Node sizes are proportional to the mean relative abundance of their respective taxon.

Characterisation of semen microbiota composition at species level.
(A) Heatmap of Log10 transformed read counts of top 25 most abundant species identified in semen samples. Samples clustered into 15 microbiota groups. (B) Silhouette scores of individual samples in microbial groups. (C) Average silhouette scores for 15 clusters at species level.

Relative abundance and prevalence matrices of Flavobacterium in relation to semen quality and morphology.
(A) Relative abundance of Flavobacterium was significantly higher in samples with abnormal semen quality (p=0.0002, q=0.02). (B) Flavobacterium was detected significantly more often in samples with abnormal semen quality (p=0.0003). (C) Flavobacterium relative abundance was significantly higher in samples with <4% morphologically normal forms (p=0.0002, q=0.01). (D) Flavobacterium was also significantly more prevalent in samples with a low percentage of morphologically normal sperm (p=0.0009).

Seminal quality and function parameters according to recruited cohorts.
Comparison of microscopic semen parameters: (A) concentration (p<0.0001, Kruskal-Wallis rank-sum test), (B) progressive motility (p<0.0001, Pearson’s chi-squared test), and (C) morphology (p<0.0001, Pearson’s chi-squared test) suggested poorer semen quality in male factor infertility (MFI) patients. Comparison of clinical semen parameters: (D) reactive oxygen species (ROS) and (E) sperm DNA fragmentation index.

Ecological parameters of seminal microbiota for the recruited study cohorts.
(A) Species richness (p=0.30) and (B) Simpson’s diversity index (p=0.49) did not differ significantly between recruited study cohorts. P-values from Kruskal-Wallis tests with Dunn’s multiple comparison test are shown on the plots. (C) Bacterial load of seminal microbiota in recruited study cohorts. Bacterial load, measured as the number of 16S rRNA gene copies per 1 ml of semen, did not differ significantly between recruited study cohorts (p=0.22, Kruskal-Wallis test).
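For readers less familiar with the two ecological measures in panels (A) and (B), the following minimal sketch computes them from raw read counts. It assumes the Gini-Simpson form of the index, 1 − Σp², because the legend does not state which variant was used; the counts and function names are hypothetical.

```python
def richness(counts):
    """Number of taxa observed at least once in a sample."""
    return sum(1 for c in counts if c > 0)

def simpson_diversity(counts):
    """Gini-Simpson index, 1 - sum(p_i^2), from raw read counts (assumed variant)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Hypothetical read counts for two samples (not study data).
dominated = [970, 10, 10, 10, 0]   # one taxon dominates
even = [200, 200, 200, 200, 200]   # reads spread evenly across taxa

for name, sample in (("dominated", dominated), ("even", even)):
    print(name, richness(sample), round(simpson_diversity(sample), 3))
```

A sample dominated by a single taxon scores close to 0 on this index, while an evenly distributed sample approaches 1 − 1/S for S taxa.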
Tables
Patient demographics and notable parameters of seminal quality and function for controls and study subjects.
Fisher’s exact tests for all except age. Chi-squared test for age (n=223).
| Factor | Categories | Controls | Study cases | p-Value |
|---|---|---|---|---|
| DNA fragmentation index | Low | 45/114 (40%) | 69/114 (60%) | 0.0002*** |
| | High | 12/82 (15%) | 70/82 (85%) | |
| ROS | <3.77 RLU/s | 53/143 (37%) | 90/143 (63%) | 0.02* |
| | >3.77 RLU/s | 5/33 (15%) | 28/33 (85%) | |
| Semen volume | Optimal | 55/208 (26%) | 153/208 (74%) | 0.03* |
| | Suboptimal | 8/15 (53%) | 7/15 (47%) | |
| Age | <34 | 11/49 (22%) | 38/49 (78%) | 0.04* |
| | 34–41 | 31/124 (25%) | 93/124 (75%) | |
| | >41 | 21/50 (42%) | 29/50 (58%) | |
| Ethnicity | Caucasian | 39/156 (25%) | 117/156 (75%) | 0.10 |
| | Non-Caucasian | 24/67 (36%) | 43/67 (64%) | |
| Concentration | >15 M/ml | 58/182 (32%) | 124/182 (68%) | 0.01* |
| | <15 M/ml | 5/41 (12%) | 36/41 (88%) | |
| Progressive motility | >32% | 60/207 (29%) | 147/207 (71%) | 0.56 |
| | <32% | 3/16 (19%) | 13/16 (81%) | |
| Sperm morphology | >4% | 22/74 (30%) | 52/74 (70%) | 0.87 |
| | <4% | 41/144 (28%) | 103/144 (72%) | |
| Semen quality | Optimal | 24/78 (31%) | 54/78 (69%) | 0.53 |
| | Suboptimal | 39/145 (27%) | 106/145 (73%) | |
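To make the test behind the two-category rows concrete, here is a minimal sketch of a two-sided Fisher's exact test applied to the DNA fragmentation index counts from the table above. Whether the authors used a two-sided test, and which software they used, is not stated; the function name and the two-sided convention (summing all tables no more likely than the observed one) are assumptions.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# DNA fragmentation index rows from the table above:
# low: 45 controls, 69 study cases; high: 12 controls, 70 study cases.
p = fisher_exact_two_sided(45, 69, 12, 70)
print(f"p = {p:.4g}")
```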
Differential abundance analysis for bacterial genera with seminal quality and functional parameters.
Positive t-values indicate a positive relationship, and a negative t-value describes a negative relationship between relative abundance of taxa and seminal quality and function parameters. Significant relationships are indicated using p-values. q-Values represent Benjamini-Hochberg false discovery rate corrected p-values for multiple comparisons.
| Sperm quality and function parameters | Genera | Welch’s t-statistic | p-Value | q-Value |
|---|---|---|---|---|
| Sperm DNA fragmentation | Finegoldia | –2.36 | 0.01* | 0.27 |
| | Cutibacterium | –2.20 | 0.02* | 0.27 |
| | Porphyromonas | 2.16 | 0.03* | 0.27 |
| | Varibaculum | 2.11 | 0.03* | 0.27 |
| ROS | Lactobacillus | 2.18 | 0.02* | 0.66 |
| | Corynebacterium | –2.04 | 0.04* | 0.66 |
| Semen quality | Flavobacterium | 3.39 | 0.0008*** | 0.02* |
| | Prevotella | 2.26 | 0.02* | 0.38 |
| Sperm concentration | Porphyromonas | –2.08 | 0.03* | 0.61 |
| Sperm morphology | Flavobacterium | 3.64 | 0.0003*** | 0.01* |
| | Prevotella | 2.03 | 0.04* | 0.67 |
| Semen volume | Corynebacterium | 2.27 | 0.02* | 0.32 |
| | Actinotignum | –2.20 | 0.02* | 0.32 |
| | Varibaculum | –2.16 | 0.03* | 0.32 |
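The q-values in the table above are described as Benjamini-Hochberg corrected p-values. A minimal sketch of that correction is given below to show why a taxon with p<0.05 can still have a q-value well above 0.05 when many taxa are tested. The p-values are hypothetical and the function name is an assumption; this is not the authors' code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical p-values for five taxa (not taken from the study).
p_values = [0.001, 0.04, 0.03, 0.20, 0.02]
q_values = benjamini_hochberg(p_values)
for p, q in zip(p_values, q_values):
    print(f"p = {p:.3f} -> q = {q:.3f}")
```

Each p-value is scaled by the number of tests divided by its rank, so the adjustment grows with the number of taxa tested for a given parameter.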
Differential abundance analysis for bacterial species with seminal quality and functional parameters.
Positive t-values indicate a positive relationship, and a negative t-value describes a negative relationship between relative abundance of taxa and seminal quality and function parameters. Significant relationships are indicated using p-values. q-Values represent Benjamini-Hochberg false discovery rate corrected p-values for multiple comparisons.
| Clinical factor | Species | Welch’s t-statistic | p-Value | q-Value |
|---|---|---|---|---|
| Sperm DNA fragmentation | Peptostreptococcaceae bacterium | 2.18 | 0.03* | 0.91 |
| ROS | Lactobacillus iners | 2.24 | 0.02* | 0.94 |
| | Unidentified Anaerococcus | –2.03 | 0.04* | 0.94 |
| Semen quality | Unidentified Flavobacterium | 3.76 | 0.0002*** | 0.01* |
| | Corynebacterium tuberculostearicum | –2.06 | 0.04* | 0.82 |
| Semen volume | Corynebacterium tuberculostearicum | 2.64 | 0.008 | 0.24 |
| | Unidentified Varibaculum | –2.48 | 0.01 | 0.24 |
| | Staphylococcus epidermidis | 2.35 | 0.01 | 0.24 |
| | Unidentified Peptoniphilus | –2.32 | 0.02 | 0.24 |
| | Dialister propionicifaciens | –2.24 | 0.02 | 0.24 |
| | Prevotella colorans | –2.14 | 0.03 | 0.26 |
| Cohorts | Staphylococcus haemolyticus | 0.04 | 0.02 | 0.97 |
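The t-statistics in the differential abundance tables are Welch's t-statistics, which compare two group means without assuming equal variances. The sketch below shows the calculation on hypothetical relative abundances; the group names, the values, and the choice of which group comes first (which sets the sign of t) are assumptions, since the legend does not specify the ordering.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t-statistic and Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny  # squared standard errors
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df

# Hypothetical relative abundances (%) of one taxon in two groups.
abnormal = [2.1, 3.4, 1.8, 4.0, 2.9]
normal = [0.9, 1.2, 0.7, 1.5, 1.1, 0.8]
t, df = welch_t(abnormal, normal)
print(f"t = {t:.2f}, df = {df:.1f}")
```

With this argument order, a positive t means the taxon is more abundant in the first group; swapping the arguments flips the sign but leaves the degrees of freedom unchanged.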
Differential abundance analysis for specific taxa at genera level for controls and cases with male factor infertility.
Positive t-values indicate a positive relationship, and a negative t-value describes a negative relationship between relative abundance of taxa and seminal quality and function parameters. Significant relationships are indicated using p-values. q-Values represent Benjamini-Hochberg false discovery rate corrected p-values for multiple comparisons.
| Clinical factor | Genera | Welch’s t-statistic | p-Value | q-Value |
|---|---|---|---|---|
| Sperm DNA fragmentation | Cutibacterium | –2.56 | 0.01* | 0.31 |
| | Porphyromonas | 2.34 | 0.02* | 0.31 |
| | Varibaculum | 1.96 | 0.051 | 0.53 |
| ROS | Finegoldia | –1.99 | 0.04* | 0.77 |
| Sperm concentration | Finegoldia | 2.04 | 0.04* | 0.71 |
| Sperm morphology | Flavobacterium | 3.64 | 0.0003*** | 0.01* |
| | Prevotella | 2.03 | 0.04* | 0.67 |
| Semen volume | Facklamia | 2.99 | 0.003** | 0.10 |
| | Actinotignum | –2.20 | 0.02* | 0.36 |
| | Dialister | –1.99 | 0.04* | 0.36 |
Differential abundance analysis for specific taxa at species level for controls and cases with male factor infertility.
Positive t-values indicate a positive relationship, and a negative t-value describes a negative relationship between relative abundance of taxa and seminal quality and function parameters. Significant relationships are indicated using p-values. q-Values represent Benjamini-Hochberg false discovery rate corrected p-values for multiple comparisons.
| Clinical factor | Species | Welch’s t-statistic | p-Value | q-Value |
|---|---|---|---|---|
| Sperm DNA fragmentation | Staphylococcus hominis | –2.32 | 0.02* | 0.68 |
| ROS | Unidentified Flavobacterium | 2.42 | 0.01 | 0.54 |
| | Unidentified Anaerococcus | –2.12 | 0.03 | 0.54 |
| | Schaalia radingae | –2.12 | 0.03* | 0.54 |
| | Haemophilus parainfluenzae | 2.02 | 0.04* | 0.54 |
| Semen quality | Unidentified Flavobacterium | 2.36 | 0.01* | 0.91 |
| Semen volume | Dialister micraerophilus | –2.66 | 0.008** | 0.41 |
| | Corynebacterium tuberculostearicum | 2.27 | 0.02* | 0.44 |
| | Staphylococcus epidermidis | 2.22 | 0.02* | 0.44 |
| | Actinotignum schaalii | –2.00 | 0.04* | 0.45 |
| Cohorts | Staphylococcus haemolyticus | 0.04 | 0.01* | 0.68 |
Comparison of mean age and prevalence of ethnicities in study recruitment cohorts.
Ethnicity representation amongst recruited cohorts was not significantly different (p=0.38, chi-squared test). RPL: recurrent pregnancy loss, MFI: male factor infertility, UI: unexplained infertility.
| Study cohort | Age (mean ± SD) | Ethnicity |
|---|---|---|
| Control (n=63) | 40.1±8 | 39/63 (62%) Caucasian |
| | | 24/63 (38%) Non-Caucasian |
| RPL (n=46) | 38.2±5 | 35/46 (76%) Caucasian |
| | | 11/46 (24%) Non-Caucasian |
| MFI (n=58) | 36.3±4.5 | 41/58 (70%) Caucasian |
| | | 17/58 (30%) Non-Caucasian |
| UI (n=56) | 37±4.7 | 41/56 (73%) Caucasian |
| | | 15/56 (27%) Non-Caucasian |
Distribution of clinical factors, microscopic seminal parameters, confounding factors, and recruitment cohorts according to genera clusters.
Chi-squared tests.
| Factors | Thresholds | Cluster 1 | Cluster 2 | Cluster 3 | p-Value |
|---|---|---|---|---|---|
| DNA frag index | Low | 60 (53%) | 39 (34%) | 15 (13%) | 0.47 |
| | High | 37 (45%) | 35 (43%) | 10 (12%) | |
| ROS | <3.77 RLU/s | 74 (52%) | 56 (39%) | 13 (9%) | 0.81 |
| | >3.77 RLU/s | 19 (58%) | 11 (33%) | 3 (9%) | |
| Semen volume | Optimal | 105 (50%) | 80 (38%) | 23 (12%) | 0.12 |
| | Suboptimal | 8 (53%) | 3 (20%) | 4 (27%) | |
| Cohorts | Control | 36 (57%) | 22 (35%) | 5 (8%) | 0.76 |
| | MFI | 26 (45%) | 25 (45%) | 7 (10%) | |
| | RPL | 23 (50%) | 17 (37%) | 6 (13%) | |
| | UI | 28 (50%) | 19 (34%) | 9 (16%) | |
| Age | <34 | 30 (61%) | 14 (29%) | 5 (10%) | 0.58 |
| | 34–41 | 59 (48%) | 49 (40%) | 16 (12%) | |
| | >41 | 24 (48%) | 20 (40%) | 6 (12%) | |
| Ethnicity | Caucasian | 82 (53%) | 57 (37%) | 17 (10%) | 0.58 |
| | Non-Caucasian | 31 (46%) | 26 (39%) | 10 (15%) | |
| Concentration | >15 M/ml | 93 (51%) | 67 (37%) | 22 (12%) | 0.96 |
| | <15 M/ml | 20 (49%) | 16 (39%) | 5 (12%) | |
| Progressive motility | >32% | 105 (51%) | 78 (38%) | 24 (11%) | 0.67 |
| | <32% | 8 (50%) | 5 (31%) | 3 (19%) | |
| Morphology | >4% | 37 (50%) | 24 (32%) | 13 (18%) | 0.19 |
| | <4% | 72 (50%) | 58 (40%) | 14 (10%) | |
| Semen quality | Optimal | 41 (53%) | 24 (31%) | 13 (16%) | 0.17 |
| | Suboptimal | 72 (50%) | 59 (41%) | 14 (10%) | |
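As an illustration of the chi-squared test named in the legend, the sketch below computes Pearson's statistic for the semen quality counts from the table above. It assumes a plain Pearson test with no continuity correction, which the legend does not confirm, and it uses the closed-form upper tail exp(−x/2), which is valid only for 2 degrees of freedom. The function name is hypothetical.

```python
from math import exp

def chi_squared_statistic(table):
    """Pearson's chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Semen quality counts from the table above (Clusters 1 to 3).
semen_quality = [
    [41, 24, 13],  # optimal
    [72, 59, 14],  # suboptimal
]
stat, df = chi_squared_statistic(semen_quality)

# For df == 2 only, the chi-squared upper tail has the closed form exp(-x / 2).
p_value = exp(-stat / 2) if df == 2 else None
print(f"chi2 = {stat:.2f}, df = {df}, p = {p_value}")
```

Under these assumptions the result should fall close to the p-value reported for that row; rows with other degrees of freedom would need a general chi-squared distribution function from a statistics library.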
Richness and diversity of seminal bacteria according to seminal quality and function parameters.
Categorical classifications of seminal parameters were based on the clinically defined thresholds. Mann-Whitney tests for all except age. Kruskal-Wallis test was used for age.
Factors | Richness p-value | Diversity p-value |
---|---|---|
DNA frag index | 0.68 | 0.89 |
ROS | 0.25 | 0.23 |
Semen volume | 0.54 | 0.85 |
Age | 0.14 | 0.12 |
Ethnicity | 0.31 | 0.24 |
Concentration | 0.79 | 0.66 |
Progressive motility | 0.38 | 0.54 |
Morphology | 0.82 | 0.97 |
Semen quality | 0.74 | 0.90 |