(A) Comparative coverage of the draft P. copri DSM 18205 genome between individuals and within healthy and NORA groups. Gray points are median fragments per kilobase per million (FPKM) for 1-kb windows, gray lines within the plot are the interquartile range for each window, and red and blue lines are the LOWESS-smoothed averages for the NORA and healthy groups, respectively. Gray lines on the horizontal axis represent boundaries between assembled contigs. Regions are variably covered between subjects and groups, with several genomic islands lacking coverage overall or showing especially variable coverage (dark blue lines below the plot). (B) The presence (blue) or absence (gray) of previously reported P. copri-unique marker genes (Segata et al., 2012) in 11 stool samples from five subjects of the Human Microbiome Project (HMP) is shown as a heatmap. We report, in columns, only those P. copri-specific markers showing variable presence/absence patterns across the considered HMP samples. Each row represents a different sample collection date, groups of rows represent subjects, and groups of columns correspond to different variably covered genomic islands. Strains of P. copri are defined by the presence and absence of particular genes, which remain stable for at least 6 months in these individuals. All inter- and intra-individual comparisons between rows are highly statistically significant (p<<0.001, ‘Materials and methods’). (C) The P. copri pangenome was identified by finding P. copri ORFs in all HMP and NORA cohort subjects, and the presence or absence of these ORFs was calculated for each subject (‘Materials and methods’, Figure 3—figure supplement 1). Several ORFs are statistically significant biomarkers distinguishing healthy from NORA status (q<0.25) (Supplementary file 1B, ‘Materials and methods’).
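
As an illustration of the per-window summary plotted in panel A, the following minimal sketch computes the median and interquartile range of FPKM across subjects for each 1-kb window. It is not the pipeline used for the figure: the function name `window_summary`, the input layout (one list of per-window FPKM values per subject), and the choice of the inclusive quartile method are assumptions made for this example, and the LOWESS smoothing step is not shown.

```python
from statistics import median, quantiles


def window_summary(fpkm_by_subject):
    """Summarize coverage per window across subjects.

    fpkm_by_subject: list of per-subject lists, each holding one FPKM
    value per 1-kb window (hypothetical layout; all lists same length).
    Returns one (median, first quartile, third quartile) tuple per window.
    """
    n_windows = len(fpkm_by_subject[0])
    summary = []
    for w in range(n_windows):
        # Collect this window's FPKM value from every subject.
        values = [subject[w] for subject in fpkm_by_subject]
        # Quartiles with the inclusive method (an assumption; other
        # quantile definitions give slightly different bounds).
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        summary.append((median(values), q1, q3))
    return summary
```

A window with a low median and a wide interquartile range in this summary would correspond to the variably covered regions described in the caption.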