Large-scale whole genome sequencing of M. tuberculosis provides insights into transmission in a high prevalence area

  1. JA Guerra-Assunção
  2. AC Crampin
  3. RMGJ Houben
  4. T Mzembe
  5. K Mallard
  6. F Coll
  7. P Khan
  8. L Banda
  9. A Chiwaya
  10. RPA Pereira
  11. R McNerney
  12. PEM Fine
  13. J Parkhill
  14. TG Clark
  15. JR Glynn (corresponding author)
  1. London School of Hygiene and Tropical Medicine, United Kingdom
  2. Karonga Prevention Study, Malawi
  3. Wellcome Trust Sanger Institute, United Kingdom
4 figures and 3 tables

Figures

Figure 1
Phylogenetic tree of all samples from Karonga.

Lineages form monophyletic groups within the phylogeny, as expected. Lineage 1 (Indo Oceanic) is represented in dark blue, Lineage 2 (Beijing/East Asian) in light blue, Lineage 3 (East African …

https://doi.org/10.7554/eLife.05166.003
Figure 2 with 1 supplement
Pairwise SNP distances between all pairs of samples with known RFLP.

The y axis shows the relative frequency within each subgroup: same RFLP pattern (red), different RFLP patterns (blue); same individual, same RFLP (green). (A) shows the full data set, and (B) is …

https://doi.org/10.7554/eLife.05166.005
Figure 2—figure supplement 1
Pairwise mutation rates between all pairs of samples with known RFLP (calculated as number of SNPs/number of days between dates of disease onset between individuals).

The y axis shows the relative frequency within each subgroup: same RFLP pattern (red), different RFLP patterns (blue); same individual, same RFLP (green). (A) shows the full data set, and (B) is …

https://doi.org/10.7554/eLife.05166.006
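
The pairwise rate defined in the Figure 2 supplement caption (number of SNPs divided by number of days between dates of disease onset) can be illustrated with a minimal Python sketch. The function name `pairwise_rate`, the example dates, and the SNP count are hypothetical, and returning `None` for identical onset dates is an assumption, since the caption does not say how such pairs were handled.

```python
from datetime import date
from typing import Optional


def pairwise_rate(snps: int, onset_a: date, onset_b: date) -> Optional[float]:
    """Return SNPs per day between two onset dates.

    Returns None when both dates are the same, because the rate is
    undefined for a zero-day interval (an assumption of this sketch).
    """
    days = abs((onset_b - onset_a).days)
    if days == 0:
        return None
    return snps / days


# Hypothetical pair of patients: 3 SNPs apart, onset dates 730 days apart.
rate_per_day = pairwise_rate(3, date(2003, 1, 1), date(2004, 12, 31))

# Convert to SNPs per year for comparison with slopes quoted in SNPs/year.
rate_per_year = rate_per_day * 365.25
```

Multiplying by 365.25 only rescales the per-day figure; it is a convenience for reading the result against the SNPs/year slopes in the Figure 4 supplements.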
Figure 3 with 1 supplement
Examples of clusters built using SeqTrack.

All clusters are shown in Figure 3—figure supplement 1. Each polygon represents a patient, with larger polygons representing two or more patients with identical sequences. The patient details are …

https://doi.org/10.7554/eLife.05166.007
Figure 3—figure supplement 1
Clusters built using SeqTrack.

Each polygon represents a patient, with larger polygons representing two or more patients with identical sequences. The patient details are written inside the polygon: F = female, M = male. The …

https://doi.org/10.7554/eLife.05166.008
Figure 4 with 2 supplements
Distribution of clusters and SNPs.

(A) Number of clusters of different sizes and percentage of patients in clusters of different sizes. Cluster size 1 refers to unclustered patients. (B) Cluster size by lineage. The p values are for …

https://doi.org/10.7554/eLife.05166.009
Figure 4—figure supplement 1
Relationship between number of SNPs and the number of days between samples from individuals with more than one specimen available from the same episode of disease or from a relapse.

For each individual, we selected the first and last specimens if there were more than two. (Random noise has been introduced to allow multiple similar results to be visualized.) The slope is given …

https://doi.org/10.7554/eLife.05166.010
Figure 4—figure supplement 2
Relationship between number of SNPs and the number of days between dates of disease onset for transmissions identified from the network, by lineage.

(Random noise has been introduced to allow multiple similar results to be visualized.) The slopes are given in SNPs/year.

https://doi.org/10.7554/eLife.05166.011

Tables

Table 1

Characteristics of patients included in the analysis and distribution of lineages

https://doi.org/10.7554/eLife.05166.004
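| | Lineage 1 | Lineage 2 | Lineage 3 | Lineage 4 | Overall | p* |
|---|---|---|---|---|---|---|
| Overall | 269 (16.0) | 74 (4.4) | 205 (12.2) | 1139 (67.5) | 1687 | |
| Age | | | | | | |
| <20 | 9 (12.3) | 7 (9.6) | 9 (12.3) | 48 (65.7) | 73 | |
| 20–29 | 46 (10.3) | 26 (5.8) | 48 (10.7) | 327 (73.2) | 447 | |
| 30–39 | 109 (18.4) | 17 (2.9) | 81 (13.7) | 386 (65.1) | 593 | |
| 40–49 | 61 (19.8) | 18 (5.8) | 39 (12.7) | 190 (61.7) | 308 | |
| 50+ | 44 (16.5) | 6 (2.3) | 28 (10.5) | 188 (70.7) | 266 | 0.001 |
| Sex | | | | | | |
| Female | 130 (14.6) | 47 (5.3) | 94 (10.6) | 617 (69.5) | 888 | |
| Male | 139 (17.4) | 27 (3.4) | 111 (13.9) | 522 (65.3) | 799 | 0.02 |
| Year | | | | | | |
| 1995–1998 | 55 (15.5) | 8 (2.3) | 29 (8.2) | 263 (74.1) | 355 | |
| 1999–2001 | 43 (11.5) | 23 (6.1) | 43 (11.5) | 266 (70.9) | 375 | |
| 2002–2004 | 80 (19.4) | 22 (5.3) | 54 (13.1) | 257 (62.2) | 413 | |
| 2005–2007 | 54 (17.4) | 11 (3.5) | 44 (14.2) | 202 (65.0) | 311 | |
| 2008–2010 | 37 (15.9) | 10 (4.3) | 35 (15.0) | 151 (64.8) | 233 | 0.004 |
| TB type | | | | | | |
| Smear+ | 212 (17.3) | 52 (4.3) | 156 (12.8) | 804 (65.7) | 1224 | |
| Smear− | 46 (12.1) | 19 (5.0) | 38 (10.0) | 276 (72.8) | 379 | |
| Extrapulmonary | 11 (13.1) | 3 (3.6) | 11 (13.1) | 59 (70.2) | 84 | 0.1 |
| HIV status | | | | | | |
| Negative | 47 (10.8) | 23 (5.3) | 57 (13.0) | 310 (70.9) | 437 | |
| Positive | 148 (19.3) | 28 (3.6) | 107 (13.9) | 486 (63.2) | 769 | 0.001 |
| Previous TB | | | | | | |
| No | 251 (16.7) | 66 (4.4) | 171 (11.4) | 1019 (67.6) | 1507 | |
| Yes | 18 (10.0) | 8 (4.4) | 34 (18.9) | 120 (66.7) | 180 | 0.007 |
| Isoniazid resistance | | | | | | |
| Resistant | 20 (17.2) | 0 (0.0) | 21 (18.1) | 75 (64.7) | 116 | |
| Sensitive | 244 (15.9) | 74 (4.8) | 181 (11.8) | 1033 (67.4) | 1532 | 0.03 |
| Residence | | | | | | |
| Karonga | 198 (16.4) | 53 (4.4) | 148 (12.3) | 806 (66.9) | 1205 | |
| Malawi | 48 (16.6) | 13 (4.5) | 32 (11.1) | 196 (67.8) | 289 | |
| Other country | 11 (11.5) | 7 (7.3) | 17 (17.7) | 61 (63.5) | 96 | 0.4 |
| Birth place | | | | | | |
| Karonga | 174 (17.0) | 46 (4.5) | 135 (13.2) | 667 (65.3) | 1022 | |
| Malawi | 55 (16.3) | 14 (4.1) | 31 (9.2) | 238 (70.4) | 338 | |
| Other country | 34 (11.7) | 14 (4.8) | 37 (12.7) | 206 (70.8) | 291 | 0.2 |

  1. * From χ² comparison between lineages.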
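
As an illustration of the kind of χ² comparison the Table 1 footnote refers to, the sketch below computes a Pearson χ² statistic for the sex-by-lineage counts in Table 1. It assumes a standard Pearson test on the contingency table of counts; the source does not state the exact procedure or software. The helper names `chi_square` and `chi2_sf_3dof` are hypothetical, and the tail-probability formula is the closed form that holds only for 3 degrees of freedom.

```python
import math

# Sex-by-lineage counts taken from Table 1 (lineages 1 to 4).
observed = {
    "Female": [130, 47, 94, 617],
    "Male": [139, 27, 111, 522],
}


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            stat += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof


def chi2_sf_3dof(x):
    """Upper-tail probability of a chi-square variable with exactly 3 d.f."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


stat, dof = chi_square(observed)
p_value = chi2_sf_3dof(stat)  # valid here because dof == 3
```

Under these assumptions the p value comes out close to the 0.02 reported for sex in Table 1. For tables with other dimensions, a general χ² tail function would be needed in place of the 3 d.f. shortcut.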

Table 2

Characteristics associated with disease due to recent infection

https://doi.org/10.7554/eLife.05166.012
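| Characteristic | Linked/Total, n/N | % | Association with links (unadjusted), OR (95% CI) | p (lrtest) | Adjusted for age, sex, year, lineage, OR (95% CI) | Adjusted for other variables included in model*, OR (95% CI) | p (lrtest) |
|---|---|---|---|---|---|---|---|
| Overall | 409/1074 | 38.1 | | | | | |
| Lineage | | | | | | | |
| 1 | 56/183 | 30.6 | 0.76 (0.53–1.1) | | 0.81 (0.57–1.2) | 0.81 (0.57–1.2) | |
| 2 | 34/52 | 65.4 | 3.2 (1.8–5.9) | | 3.0 (1.6–5.4) | 3.2 (1.7–5.8) | |
| 3 | 58/129 | 45.0 | 1.4 (0.96–2.1) | | 1.5 (1.0–2.2) | 1.5 (1.0–2.2) | |
| 4 | 261/710 | 36.8 | 1 | <0.001 | 1 | 1 | <0.001 |
| Age | | | | | | | |
| <20 | 19/36 | 65.8 | 2.9 (1.4–6.0) | | 2.5 (1.2–5.4) | 2.6 (1.2–5.6) | |
| 20–29 | 113/276 | 45.8 | 1.8 (1.2–2.7) | | 1.6 (1.1–2.5) | 1.8 (1.2–2.8) | |
| 30–39 | 152/404 | 39.6 | 1.5 (1.0–2.3) | | 1.5 (0.99–2.2) | 1.6 (1.0–2.3) | |
| 40–49 | 81/201 | 44.2 | 1.7 (1.1–2.7) | | 1.0 (1.0–2.6) | 1.7 (1.1–2.6) | |
| 50+ | 44/157 | 33.5 | 1 | 0.007 | 1 | 1 | 0.03 |
| Sex | | | | | | | |
| Female | 229/575 | 39.8 | 1 | | | | |
| Male | 180/499 | 36.1 | 0.85 (0.67–1.1) | 0.05 | 0.93 (0.72–1.2) | 0.94 (0.72–1.2) | 0.4 |
| Year | | | | | | | |
| 1999–2001 | 141/311 | 45.3 | 1 | | 1 | 1 | <0.001 |
| 2002–2004 | 117/322 | 36.3 | 0.69 (0.50–0.95) | | 0.73 (0.52–1.0) | 0.69 (0.50–0.97) | |
| 2005–2007 | 92/244 | 37.7 | 0.73 (0.52–1.0) | | 0.78 (0.55–1.1) | 0.70 (0.49–1.0) | |
| 2008–2010 | 59/197 | 30.0 | 0.52 (0.35–0.75) | 0.001 | 0.53 (0.36–0.77) | 0.48 (0.32–0.70) | |
| TB type | | | | | | | |
| Smear-positive pulmonary | 312/821 | 38.0 | 1 | | 1 | | |
| Smear-negative pulmonary | 97/253 | 38.3 | 1.0 (0.76–1.4) | 0.9 | 0.95 (0.71–1.3) | | |
| HIV status | | | | | | | |
| HIV− | 102/283 | 36.0 | 1 | | | | |
| HIV+ no ART | 173/436 | 39.7 | 1.2 (0.85–1.6) | | 1.1 (0.75–1.5) | | |
| HIV+ on ART | 27/77 | 35.1 | 0.96 (0.56–1.6) | 0.5 | 1.0 (0.56–1.8) | | |
| INH resistance | | | | | | | |
| No | 375/979 | 38.3 | 1 | | 1 | | |
| Yes | 28/64 | 43.8 | 1.3 (0.75–2.1) | 0.4 | 1.4 (0.81–2.3) | | |
| Unknown | | | | | | | |
| Recent residence | | | | | | | |
| Karonga | 328/816 | 40.2 | 1 | | 1 | | 0.005 |
| Other Malawi | 56/176 | 31.8 | 0.69 (0.49–0.98) | | 0.58 (0.41–0.84) | 0.58 (0.40–0.84) | |
| Other country | 16/54 | 29.6 | 0.63 (0.34–1.1) | 0.04 | 0.48 (0.26–0.91) | 0.48 (0.26–0.91) | |
| Birth place | | | | | | | |
| Karonga | 267/659 | 40.5 | 1 | | 1 | | |
| Other Malawi | 81/227 | 35.7 | 0.81 (0.60–1.1) | | 0.79 (0.57–1.1) | | |
| Other country | 59/180 | 32.8 | 0.72 (0.51–1.0) | 0.1 | 0.67 (0.47–0.97) | | |
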
  1. In this analysis individuals are defined as linked (‘backwards links’) using the cut-offs described in the text and if the closest link was with a patient within the previous 5 years. Extrapulmonary, recurrent cases, and cases before 1999 were excluded. Odds ratios (OR) calculated using logistic regression.

  2. * In this model a dummy variable was used for the 32 individuals with missing data on recent residence.

  3. Test for trend.
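
To show how an unadjusted odds ratio in Table 2 relates to the linked/total counts, here is a minimal Python sketch using the Lineage 2 and Lineage 4 (reference) rows. It uses the 2×2 cross-product ratio with a Wald interval on the log scale, which gives the same estimate as a logistic regression with a single categorical predictor; it does not reproduce the adjusted models. The function name `odds_ratio` is hypothetical.

```python
import math


def odds_ratio(linked, total, ref_linked, ref_total, z=1.96):
    """Odds ratio versus a reference group, with a Wald 95% CI (for z=1.96)."""
    a, b = linked, total - linked              # linked / not linked, group of interest
    c, d = ref_linked, ref_total - ref_linked  # linked / not linked, reference group
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf's formula).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Lineage 2 (34 linked of 52) versus Lineage 4 (261 linked of 710), from Table 2.
estimate, lower, upper = odds_ratio(34, 52, 261, 710)
```

Rounded to two significant figures, this should agree with the unadjusted entry for Lineage 2 in Table 2, 3.2 (1.8–5.9).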

Table 3

Characteristics associated with transmissibility

https://doi.org/10.7554/eLife.05166.013
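| Characteristic | Any Linked/Total, n/N | % | Association with links, OR (95% CI) | p | Adjusted for age, sex, year, lineage, smear status, OR (95% CI) | p (lrtest) |
|---|---|---|---|---|---|---|
| Overall | 431/1346 | 32.0 | | | | |
| Lineage | | | | | | |
| 1 | 59/217 | 27.2 | 0.87 (0.63–1.2) | | 0.94 (0.66–1.3) | |
| 2 | 27/61 | 44.3 | 1.7 (1.0–2.7) | | 1.9 (1.1–3.2) | |
| 3 | 65/154 | 42.2 | 1.6 (1.2–2.3) | | 1.9 (1.4–2.7) | |
| 4 | 280/914 | 30.6 | 1 | 0.006 | 1 | <0.001 |
| Age | | | | | | |
| <20 | 20/50 | 40.0 | 2.3 (1.2–4.4) | | 1.9 (0.98–3.7) | |
| 20–29 | 134/349 | 38.4 | 2.3 (1.5–3.3) | | 2.2 (1.5–3.3) | |
| 30–39 | 159/490 | 32.5 | 1.7 (1.2–2.5) | | 2.0 (1.3–2.9) | |
| 40–49 | 71/238 | 29.8 | 1.6 (1.0–2.4) | | 1.7 (1.1–2.7) | |
| 50+ | 47/219 | 21.5 | 1 | <0.001 | 1 | 0.002 |
| Sex | | | | | | |
| Female | 239/718 | 33.3 | 1 | | 1 | |
| Male | 192/628 | 30.6 | 0.87 (0.69–1.1) | 0.2 | 0.93 (0.73–1.2) | 0.5 |
| Year | | | | | | |
| 1995–1998 | 159/314 | 50.6 | 1 | | 1 | |
| 1999–2001 | 119/345 | 34.5 | 0.49 (0.36–0.66) | | 0.42 (0.31–0.58) | |
| 2002–2004 | 95/389 | 24.4 | 0.30 (0.22–0.41) | | 0.27 (0.19–0.37) | |
| 2005–2007 | 58/298 | 19.5 | 0.22 (0.16–0.32) | <0.001 | 0.20 (0.14–0.29) | <0.001 |
| TB type | | | | | | |
| Smear pos pulm | 338/1003 | 33.7 | 1 | | 1 | |
| Smear neg pulm | 93/343 | 27.1 | 0.72 (0.55–0.94) | 0.01 | 0.73 (0.55–0.96) | <0.001 |
| HIV status | | | | | | |
| HIV− | 91/318 | 28.6 | 1 | | 1 | |
| HIV+ no ART | 170/540 | 31.5 | 1.1 (0.83–1.5) | | 1.1 (0.81–1.6) | |
| HIV+ on ART | 11/48 | 22.9 | 0.70 (0.35–1.4) | 0.3 | 1.4 (0.62–3.1) | 0.6 |
| Previous TB | | | | | | |
| No | 391/1200 | 32.6 | 1 | | 1 | |
| Yes | 40/146 | 27.4 | 0.77 (0.53–1.1) | 0.2 | 0.85 (0.58–1.3) | 0.4 |
| INH resistance | | | | | | |
| No | 402/1237 | 32.5 | 1 | | 1 | |
| Yes | 29/100 | 29.0 | 0.86 (0.55–1.3) | 0.5 | 0.86 (0.54–1.4) | 0.5 |
| Recent residence | | | | | | |
| Karonga | 284/942 | 30.2 | 1 | | 1 | |
| Other Malawi | 80/234 | 34.2 | 1.2 (0.89–1.6) | | 1.0 (0.74–1.4) | |
| Other country | 20/74 | 27.0 | 0.88 (0.52–1.5) | 0.4 | 0.57 (0.33–0.98) | 0.09 |
| Birth place | | | | | | |
| Karonga | 276/811 | 34.0 | 1 | | 1 | |
| Other Malawi | 80/272 | 29.4 | 0.83 (0.62–1.1) | | 0.82 (0.60–1.1) | |
| Other country | 64/234 | 27.4 | 0.77 (0.56–1.1) | 0.2 | 0.71 (0.51–0.99) | 0.08 |
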
  1. The numbers of likely transmissions (‘forward links’) were compared by individual characteristics using ordered logistic regression. Extrapulmonary cases and cases occurring after 2007 were excluded.
