Geographic distribution of the dataset.

(A) Distribution of sample collection by country. (B) Number of genomes of each lineage included from each continent.

Boxplot showing the distribution of accessory genes within each lineage.

Lineages with a small number of genomes were excluded from the analysis. L6, La1 and M. microti have significantly smaller accessory genomes than the other lineages.

Functional composition of the core (A) and accessory (B) genomes, analysed using eggNOG-mapper and InterProScan.

(A) Phylogenetic tree based on the MTBC core genome. (B, C) PCA of the accessory genome data from Panaroo (B) and Pangraph (C).

MTBC accessory genome identified by Panaroo (A) and Pangraph (B).

Sub-lineage-specific regions of difference (RDs) identified using pangenome-based and genome alignment-based approaches.

Sub-lineages are shown on the Y-axis, coloured according to the legend and clustered by their RD presence/absence patterns. RDs are listed on the X-axis, grouped by their pattern of presence/absence.

Visual representation of the deletion caused by RD702.

The truncated bglS gene was classified as a core gene by Panaroo but contains specific internal deletions, which can be problematic for pangenomic analyses.