Genetic ancestry analyses of Tanzanian TB patients.

A) Genetic ancestry proportions of the 1,444 Tanzanian TB patients and of representative human populations that shared at least 1% of their most common genetic ancestry with the Tanzanians for K = 24 (ESN: Esan from Nigeria (1000G); LWK: Luhya from Kenya (1000G)). See Supplementary Figure S2 for the geographic distribution of all populations included in our study and Figure S5 for the ancestry composition of all African populations included. B) The geographical locations of the representative populations shown in A are depicted with black circles, and the corresponding countries are highlighted. The remaining African populations included in the analysis are represented by blue circles.

Spatial visualizations of the BS genetic ancestries and the genetic ancestries of the different self-identified ethnic groups among the TB patients in Tanzania.

The genetic ancestry was inferred with ADMIXTURE at K = 24, and the ancestries were interpolated using the pykrige module in Python (see Methods). A) eBS genetic ancestry, B) seBS genetic ancestry, and C) wBS genetic ancestry. The populations included in the spatial interpolations are marked with black dots on the maps. The maps were created using the basemap module in Python. D) Geographical origin of the ethnic groups in our TB patient cohort. The Temeke District Hospital in Dar es Salaam, where the patients were recruited, is marked with a red point. Note that for some ethnic groups, no geographical origin could be identified (Supplementary Table 1). E) Ancestry plots for the ethnic groups represented by at least 10 patients in our TB patient cohort.

Human and bacterial genotypes by disease severity measure.

Characteristics of MTBC genotypes for all patients with either human or bacterial genetic data available.

Estimated associations between disease severity, human genetic ancestry and bacterial genotype.

Three variables were included as proxies for disease severity: lung damage (mild versus severe), TB-score (mild versus severe), and bacterial load (continuous, log10-transformed). Binomial logistic regressions were performed on data from HIV-negative patients, adjusting for age, sex, smoking, socioeconomic status, level of education, malnutrition, TB type (relapse or new infection), and drug resistance status by including these variables in the model. For the ancestries and the interactions, p-values were obtained from a likelihood ratio test comparing a model that included the ancestries and interactions with a model that did not. This table combines the results of two logistic regressions per disease severity measure, one with an interaction and one without. The ancestries were transformed and categorized (see Methods), with category 1 comprising the lowest amount of the respective ancestry and category 3 (4 in the case of wBS) the highest.
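
The likelihood ratio test described above can be illustrated with a minimal sketch. It assumes that the maximized log-likelihoods of the two nested models are already available and that the test statistic is referred to a chi-square distribution whose degrees of freedom equal the number of additional parameters in the larger model. The function names (`chi2_sf`, `likelihood_ratio_test`) and the numbers in the usage note are hypothetical and do not reproduce the analysis code or results of this study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) of a chi-square variable with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Closed-form starting points: df = 1 (a = 0.5) and df = 2 (a = 1.0).
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)


def likelihood_ratio_test(loglik_reduced, loglik_full, df_diff):
    """Return (statistic, p-value) for two nested models.

    loglik_reduced: log-likelihood of the model without the terms of interest
    loglik_full:    log-likelihood of the model including them
    df_diff:        number of extra parameters in the full model
    """
    # Clamp at zero to guard against tiny negative values from numerical noise.
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    return stat, chi2_sf(stat, df_diff)
```

For example, with hypothetical log-likelihoods of -100.0 (reduced model) and -97.0 (full model) and two additional parameters, `likelihood_ratio_test(-100.0, -97.0, 2)` gives a statistic of 6.0 and a p-value of about 0.0498.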