Impact of HIV co-infection on the evolution and transmission of multidrug-resistant tuberculosis
Abstract
The tuberculosis (TB) epidemic is fueled by a parallel Human Immunodeficiency Virus (HIV) epidemic, but it remains unclear to what extent the HIV epidemic has been a driver for drug resistance in Mycobacterium tuberculosis (Mtb). Here we assess the impact of HIV co-infection on the emergence of resistance and transmission of Mtb in the largest outbreak of multidrug-resistant TB in South America to date. By combining Bayesian evolutionary analyses and the reconstruction of transmission networks utilizing a new model optimized for TB, we find that HIV co-infection does not significantly affect the transmissibility or the mutation rate of Mtb within patients and was not associated with increased emergence of resistance within patients. Our results indicate that the HIV epidemic serves as an amplifier of TB outbreaks by providing a reservoir of susceptible hosts, but that HIV co-infection is not a direct driver for the emergence and transmission of resistant strains.
https://doi.org/10.7554/eLife.16644.001eLife digest
Tuberculosis is an infectious disease caused by a bacterium called Mycobacterium tuberculosis that causes more deaths worldwide than any other infection. Individuals who are infected with the Human Immunodeficiency Virus (HIV), which weakens the immune system, are particularly vulnerable to tuberculosis. However, treating individuals who are infected with both HIV and tuberculosis is complicated because the drugs currently used to treat one infection can interfere with the effectiveness of the drugs used to treat the other.
Tuberculosis is generally treated with antibiotics. However, some strains of M. tuberculosis are difficult to treat as they have evolved to resist the effects of multiple types of antibiotics. These “multidrug-resistant” bacteria appear to be particularly common in areas where HIV infections are also common. However, it was not known whether HIV directly influences whether M. tuberculosis bacteriaevolve into drug-resistant forms.
Eldholm, Rieux et al. have now analyzed the genomes, or total genetic content, of 252 samples of M. tuberculosis taken from the largest outbreak to date of multidrug-resistant tuberculosis in South America. This made it possible to identify the genetic mutations that enable the bacteria to resist antibiotic treatment. Using mathematical models to reconstruct the spread of multidrug resistant M. tuberculosis bacteria during the outbreak also made it possible to assess who transmitted tuberculosis to whom.
The results suggest that M. tuberculosis does not evolve drug resistance any faster in patients with HIV than otherwise. Furthermore, patients infected with both HIV and tuberculosis did not transmit tuberculosis to others more often than patients who did not have HIV. However, being infected with HIV did increase the likelihood that an individual would contract tuberculosis. HIV also increased the rate at which the symptoms of tuberculosis progressed in an individual.
To clarify the effect of HIV on the spread of tuberculosis, similar studies are needed that collect more complete patient data, including their anti-HIV treatment history and their degree of immune weakening.
https://doi.org/10.7554/eLife.16644.002Introduction
Among the estimated 1.5 million people who died from TB in 2013, 360,000 were HIV co-infected and 200,000 cases were caused by multidrug-resistant TB (MDR-TB) (World Health Organization, 2015). Until the late 1980s, reports of MDR-TB were rare, and transmission of such strains was even less frequent (Reves et al., 1981; Small et al., 1993; Wells et al., 2007). The MDR-TB burden surged concurrently with the human immunodeficiency virus (HIV) pandemic and most reported early MDR-TB outbreaks mainly affected HIV co-infected individuals in hospitals and prisons (Small et al., 1993; Wells et al., 2007; Ritacco et al., 1997).
There are good epidemiological reasons to suspect that the HIV and MDR-TB pandemics are fueling each other. Not only does HIV infection render people more susceptible to develop active TB by weakening their immune system, but anti-TB drugs can also directly interfere with antiretroviral treatment. Rifampicin (RIF), one of the cornerstones in anti-TB therapy, has been shown to significantly lower serum concentrations of HIV protease and reverse transcriptase inhibitors (Burman et al., 1999; Centers for Disease Control and Prevention, 1998). To make matters worse, HIV co-infection is also associated with malabsorption of anti-TB drugs. This pattern is particularly pronounced for RIF, but seems to hold true for most anti-TB drugs (Patel et al., 1995; Peloquin et al., 1993).
HIV co-infection might also directly contribute to the accumulation of resistance in Mtb. First, as resistance mutations generally entail a fitness cost to the bacterium (at least initially), some resistant strains might be more successful in HIV+ hosts with weakened immunity leading to a reduced selective pressure on the bacillus. Second, some antiretroviral drugs used to treat HIV might have a mutagenic effect on mycobacterial genomes, but this has yet to be investigated (McGrath et al., 2014).
HIV co-infection and very low CD4 lymphocyte counts (<100 cells/mm3), a hallmark of advanced HIV infection, have been shown to be risk factors for developing resistance to RIF and to a lesser degree isoniazid (INH) (Bradford et al., 1996; Burman et al., 2006; Li et al., 2005; Porco et al., 2013). However, a systematic review of 32 studies assessing MDR-TB prevalence by HIV status did not demonstrate an overall association between acquired MDR-TB and HIV, but suggested that HIV co-infection is a risk factor for contracting primary MDR-TB (Suchindran et al., 2009). In summary, the association between HIV co-infection and Mtb drug resistance remains unclear, with a number of studies yielding conflicting results (Small et al., 1993; Chum et al., 1996; Lukoye et al., 2011; Meyssonnier et al., 2012; Robert et al., 2003). Attempts have also been made to model the impact of HIV on TB incidence and resistance (Sergeev et al., 2012), but in lieu of empirical data, such studies relied on a number of assumptions on both host and pathogen biology as well as the interactions between them.
It is beyond doubt that HIV has been a driver of increased TB incidence globally, but a recent review of the subject actually found HIV co-infection to be associated with decreased rates of TB transmission within households and between close contacts (Kwan and Ernst, 2011). This observation is possibly explained by differing manifestation of TB in HIV positives, namely less frequent cavitation and lower pulmonary bacillary load (Kwan and Ernst, 2011). External factors such as social isolation or HIV infected patients being followed up more closely than HIV negatives may also contribute to this pattern (Kwan and Ernst, 2011). Indeed, in a low-incidence setting of close follow-up, HIV co-infection was associated with reduced TB transmission (inferred by clustering) and TB among HIV co-infected was at least partly due to transmission from HIV-negative patients (Fenner et al., 2012).
In the current work we aimed to directly investigate the impact of HIV co-infection on the evolution of antibiotic resistance emergence and on transmission dynamics. We analyzed the genomes of 252 isolates belonging to the largest reported outbreak of MDR-TB in South America, caused by the M strain (Ritacco et al., 1997; Eldholm et al., 2015). The isolates were collected from patients with known HIV status from the mid-90s until 2009, providing important temporal information. To assess the impact of HIV co-infection on Mtb evolutionary rates, we estimated mutation rates in the terminal branches of a time-labelled phylogenetic tree, roughly corresponding to the evolutionary history of individual clinical Mtb isolates within sampled patients. We also inferred transmission networks by implementing a novel epidemiological model accounting for the long latency of TB. Finally, we estimated the length of the latent period by combining the results of the phylogenetic reconstruction and inferred transmission networks.
We found that HIV status of the host does not affect the mutation rate of Mtb, and that drug resistance is not more likely to evolve in HIV positive than HIV negative patients. Together these findings suggest that HIV co-infection is not a direct driver of Mtb drug resistance, which fits well with the distribution of the global burden of TB, MDR-TB and HIV. Reconstructed transmission networks did not reveal a significant impact of HIV co-infection on the ability of patients to transmit TB. However, our estimates of TB latency confirm that HIV co-infection accelerates progression to active TB.
Results
Impact of HIV co-infection on Mtb mutation rates and resistance development
After filtering out positions with low mapping quality and removal of single nucleotide polymorphisms (SNPs) in problematic regions, a total of 509 SNPs separating the 252 isolates were used to construct a Bayesian phylogeny (Figure 1) (Eldholm et al., 2015). The majority of the isolates in the study shared the same six mutations yielding resistance to INH, RIF, streptomycin, kanamycin, pyrazinamide and ethambutol (Eldholm et al., 2015). The bulk of resistance mutations evolving within the outbreak were thus made up of ethionamide (ETH) and fluoroquinolone (FLQ) resistance mutations. The HIV status was known for all patients in the study, of which 60.7% were HIV positive.

Whole-genome Bayesian evolutionary phylogeny of the M outbreak.
The peripheral color strips indicate the HIV status of patients from which the clinical isolates were collected and the resistance burden of the isolate. The scale bar is given in years since the most recent common ancestor of the outbreak.
Based on the available data we considered that the sequenced outbreak isolates represented about one third of the total number of individuals belonging to the outbreak. Based on available RFLP data and estimates of the proportion of MDR-cases in Argentina belonging to the M strain, the outbreak is believed to have caused about 550 cases between 1992 and 2002, of which 109 genomes were available for study (20%). A large fraction of isolates from before 2001, which includes the peak of the outbreak, were lost in a freezer accident. From 2003 to 2009, the M strain caused 228 cases in Argentina, of which 143 genomes were available (63%), 116 isolates were sequenced from HIV positive patients and 85 from negatives. Lost isolates amounted to 40 positives and 25 negatives. For these years there are hence no reason to suspect any bias in the HIV status of the sampled patients (χ2; p = 0.53). Lost samples can potentially inflate the length of the terminal branches as they can result in missing internal nodes, but any inflation in branch length is thus expected to apply equally to branches leading to isolates sampled from HIV positive and negative patients.
To investigate the impact of HIV-TB coinfection on the accumulation of mutations in Mtb genomes we directly counted the mutations occurring on terminal branches by performing an ancestral reconstruction analysis in PAML (Table 1, Figure 2—figure supplement 1) (Yang, 2007). We observed no significant differences in the rate at which substitutions accumulate in the genomes of strains evolving in HIV positive and negative patients (Figure 2a). However, terminal branches were significantly longer and contained significantly higher numbers of mutations in HIV negative patients than in positives (Table 1 and Figure 2—figure supplement 1), possibly reflecting a slower progression of TB in HIV negatives relative to positives.
Number of SNPs accumulated in clinical isolates.
Host HIV status | n | Mutations total | Mean number per isolate | χ2 p-value |
---|---|---|---|---|
Negative | 99 | 262 | 2.646 | < 0.001 |
Positive | 153 | 277 | 1.810 |

Impact of HIV co-infection on Mtb evolution.
Left: Rate of evolution (substitutions/site/year) on terminal branches (p = 0.1920). Right: resistance load (number of antimicrobials to which resistance-conferring mutations were found in clinical Mtb isolates, stratified by HIV status of the host.
Hypothesizing that HIV-coinfection could either be a direct driver of resistance emergence or increase the susceptibility to contract additionally resistant isolates, we tested whether patient HIV status was associated with resistance load, by counting the number of resistance determinants present in each Mtbisolate (Supplementary file 1: Sample information) and stratifying the data by HIV status. We found that the resistance load was near identical between Mtb isolates from HIV positive and negative patients (mean = 5.99 and 6.04 respectively) (Figures 1 and 2b). These results are in line with those from a very recent study of drug resistant TB in Kwazulu-Natal (Cohen et al., 2015).
The analysis above does not distinguish between mutations that emerged in the patients included in our sample (acquired resistance) and those acquired earlier in unsampled patients and subsequently transmitted to the patients in our sample (primary resistance). To investigate the impact of HIV co-infection on evolution of new resistance, we collected available treatment history for patients from whom isolates with terminal branch resistance mutations had been sampled (Supplementary file 2: Treatment histories). We excluded secondary mutations in resistance genes, namely katG and rpoB mutations, in isolates already harboring high-level resistance mutations in these genes. These mutations could either be random events or be involved in fitness compensation, but not resistance per se. We then excluded isolates collected from patients who had not been treated with drugs relevant for the terminal branch resistance mutation, as these most likely represent mutations that evolved in unsampled cases and subsequently transmitted to a sampled secondary case. This left 13 resistance mutations that evolved with high probability during therapy in 11 patients (Table 2). Nine events of acquired resistance occurred in seven HIV negatives and four in HIV positives. Based on the frequency of HIV co-infection among the sampled patients, HIV negative patients were overrepresented among cases of acquired resistance, but the difference was not significant (p= 0.24, Fisher’s exa ct test). While the sample size is arguably small, this finding does also not implicate HIV as a driver of Mtb drug resistance within the outbreak.
Identified events of within-patient acquired resistance.
Isolate ID | HIV | Treatment history | Mutation | Acquired resistance |
---|---|---|---|---|
107 | - | follow-up (ETH* treated) | ethA L225fs | ETH |
108 | - | follow-up (ETH and FLQ treated) | ethA S208P | ETH |
516 | - | follow-up (unknown treatment) | pncA D129G | PZA |
1757 | - | follow-up (ETH and FLQ treated) | ethA H22P | ETH |
2098 | - | follow-up (ETH and FLQ treated) | ethA F302S + gyrB D461V | ETH + FLQ |
2485 | - | follow-up (unknown treatment) | ethA G437fs | ETH |
POGU | - | follow-up (ETH and FLQ treated) | ethA R259fs + gyrB R292G | ETH + FLQ |
110 | + | follow-up (ETH and FLQ treated) | gyrB R446S | FLQ |
257 | + | follow-up (ETH and FLQ treated) | inhA -15 C>T | ETH |
1298 | + | follow-up (ETH and FLQ treated) | gyrA D94N | FLQ |
2569 | + | follow-up (ETH treated) | ethA S251fs | ETH |
-
*Patient received the ETH analogue prothionamide
Effect of HIV co-infection on TB transmission
To investigate the impact of HIV co-infection on transmission of Mtb, we implemented a new method to infer transmission events based on the timed phylogenetic tree (Figure 1). This was needed because a phylogeny is not directly informative about transmission events as a result of within-host diversity and evolution (Didelot et al., 2016; Pybus and Rambaut, 2009). Our methodology is briefly outlined below and explained in more details in the materials and methods section. A coalescent within-host model (Didelot et al., 2014) was combined with a Susceptible-Exposed-Infectious-Removed (SEIR) epidemiological model (Lekone and Finkenstädt, 2006). The likelihood of transmission from one host to another can be computed under this combined model, and this calculation was performed for all pairs of individuals with one acting as potential infector and the other as potential infectee. The likelihood calculation relies solely on the dates at which the two individuals were sampled, their relative position on the phylogeny, and whether the putative infector was smear positive or negative. It does not incorporate other information such as HIV status, so that these effects can be tested separately.
The SEIR model was set up with parameters for latency (mean of 5 years with 95% CI 46 days – 18.5 years) and infectious period (mean of 120 days with 95% CI 3–443 days). The infectious period includes time from symptom onset to infection clearance. A standard method for diagnosing TB is direct microscopy of sputum smears. If bacteria are visible under the microscope, the case is denoted smear positive. If no bacteria are observed, but Mtb can be cultured from the sputum, the case is culture positive but smear negative. Smear positive cases transmit TB far more efficiently on average than smear negative cases. We thus applied a so-called smear-correction, penalizing transmission event likelihoods involving a smear negative transmitter by multiplying likelihood values with 0.05.
The resulting matrix contains likelihoods of all possible transmission events (Figure 3—source data 1). For each of the 252 sampled cases in the outbreak, we extracted the most likely transmitter, resulting in 251 identified transmission pairs. Examples of transmission graphs and transmission events mapped on the phylogeny are shown in Figure 3 whereas full transmission graphs are presented as figure supplements (Figure 3—figure supplements 1 and 2). Figure 3—source data 2 provides the links between transmission graph nodes and sample IDs. We performed a simulation analysis to test the accuracy of our transmission analysis method, and sensitivity analyses to ensure that our results were robust to parameter choice (see Materials and methods).

Reconstruction of transmission events.
(A) Graphs representing two selected high-likelihood transmission chains. The colors of the edges indicate the probabilities of each transmission event from high (red) to lower (orange). Patient HIV-status is indicated by grey (negative) and blue (positive). (B) The corresponding transmission chains annotated in the timed phylogenetic tree. Red color highlights isolates linked by transmission events from a single source. Branches in magenta indicate subsequent transmission from a secondary case to additional cases (blue).
-
Figure 3—source data 1
Likelihood matrix of all possible pairwise transmission events.
- https://doi.org/10.7554/eLife.16644.009
-
Figure 3—source data 2
Conversion table linking transmission graph nodes and sample IDs.
- https://doi.org/10.7554/eLife.16644.010
Next, we analyzed transmission events as a function of the HIV status of the transmitter of transmitter-receiver pairs. We found no significant effect of HIV status on the ability of patients to cause secondary TB cases (Table 3). Due to incomplete sampling, a proportion of identified transmission pairs are expected to be spurious, as unsampled intermediary hosts go undetected. To account for this, the analyses were repeated including only the most likely transmission events using three thresholds of increasing stringency (top 45%, 35% or 25% most likely transmissions). These subsets are expected to be increasingly enriched for true transmission pairs, but subsampling did not affect the original findings (Table 3). We also explicitly investigated the distribution of the number of transmissions per transmitter to test whether this could be affected by HIV status, but detected no significant differences between HIV-status of transmitters (Table 4). The 25% most likely infection events were mapped onto the time-labelled phylogeny for a visual integration of the modelled transmission links (Figure 3—figure supplement 3).
Number of reconstructed transmission events.
Transmission event cut-off | Donor HIV status | Observed | Expected | Obs/Exp | χ2 p value |
---|---|---|---|---|---|
All transmissions | Negative | 80 | 98.61 | 0.81 | 0.3185 |
Positive | 171 | 152.39 | 1.12 | ||
Top 25% events | Negative | 20 | 24.75 | 0.81 | 0.2205 |
Positive | 43 | 38.25 | 1.12 | ||
Top 35% events | Negative | 30 | 34.57 | 0.87 | 0.3185 |
Positive | 58 | 53.43 | 1.09 | ||
Top 45% events | Negative | 36 | 44.39 | 0.81 | 0.1060 |
Positive | 77 | 68.61 | 1.12 |
Distribution of transmissions as a function of HIV status of transmitter.
All transmission events | |||||||||||||
Transmissions per transmitter: | Kruskal-Wallis p value | ||||||||||||
HIV status | none | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|
neg | 50 | 37 | 4 | 2 | 1 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0.075 |
pos | 63 | 58 | 11 | 10 | 5 | 2 | 2 | 1 | 0 | 0 | 0 | 1 | |
Top 25% likely transmission events | |||||||||||||
Transmissions per transmitter: |
| ||||||||||||
HIV status | none | 1 | 2 | 3 | 4 | 5 |
| ||||||
neg | 83 | 15 | 0 | 0 | 0 | 1 | 0.304 | ||||||
pos | 121 | 25 | 3 | 4 | 0 | 0 | |||||||
Top 35% likely transmission events | |||||||||||||
Transmissions per transmitter: |
| ||||||||||||
HIV status | none | 1 | 2 | 3 | 4 | 5 |
| ||||||
neg | 75 | 21 | 2 | 0 | 0 | 1 | 0.505 | ||||||
pos | 111 | 33 | 4 | 4 | 0 | 1 | |||||||
Top 45% likely transmission events | |||||||||||||
Transmissions per transmitter: |
| ||||||||||||
HIV status | none | 1 | 2 | 3 | 4 | 5 |
| ||||||
neg | 69 | 27 | 2 | 0 | 0 | 1 | 0.324 | ||||||
pos | 100 | 39 | 7 | 5 | 1 | 1 |
To further assess performance of the epidemiological modelling, we investigated whether six pairs of isolates with known epidemiological links (epi-pairs) had been identified by the transmission analysis. Four pairs of household contacts were identified as likely transmission pairs by the genomic analysis. All four were among the 35% most likely transmission events, and two among the top 25%. The SNP differences between these epi-pairs ranged from one to three SNPs (Supplementary file 3: SNP distances between epi-pairs). The remaining two epi-pairs were not identified as likely transmission events. These included one pair of household contacts and one pair of isolates from the same patient taken 4.5 years apart. The genomic differences were nine and five SNPs respectively, which explains why the model did not identify these as likely transmission pairs. Interestingly, drug resistance had evolved in one of the epidemiologically linked isolates in both of these pairs, but in none of the four other pairs. We previously showed that a large number of mutations can hitchhike in the genetic background of resistance mutations sweeping to fixation and hypothesized that such selective sweeps could potentially confuse the reconstruction of transmission events (Eldholm et al., 2014). These two cases might well exemplify such a situation. However, it cannot be ruled out that the epi-links actually represent independent sources of infection (re-infection in the serially sampled patient).
Effect of HIV on progression of Mtb infection to active TB
We then set out to estimate the effect of HIV co-infection on the length of TB latency. For pairs of samples connected by a transmission event, transmission of Mtb from host A to B must happen after the date of the node connecting the two isolates in the Bayesian phylogeny (Figure 1) (Didelot et al., 2012, 2013). The date of transition from silent infection to active TB is unknown, but must happen before sampling time, when the active status is known. An upward biased estimate for the length of latency period of individual j is therefore given by the difference between the date of the MRCA of the transmitter i and the receiver j (when j was not yet infected) and the date of sampling of j (by which time j had developed active TB). Although this estimator clearly overestimates the latency period, there is no a priori reason to suspect that the bias should be different between HIV negatives and positives. Any significant difference is therefore likely to reflect a difference in length of the actual latency period. Accordingly, we extracted the length (in years) of the branches separating the MRCA and recipient of the transmission pairs and stratified the data by HIV status of the recipient.
As we did not have an exhaustive sampling of all isolates in the outbreak, not all individuals would have donors present in the phylogenetic tree. To account for this, we analyzed branch lengths of the receiver for all 251 inferred transmissions, and separately for the 45%, 35% and 25% best supported transmission events, respectively. Again, we expected the proportion of genuine transmissions to increase in frequency as we restricted the analysis to a smaller subset of the best-supported transmissions. The length of the branches leading to HIV negative hosts was significantly longer than for HIV positive hosts when including all 251 estimated transmission events (p<0.001), and this difference remained significant for all three subsampling regimes (Figure 4).

Estimating latency time as a function of HIV status.
(A) For pairs of samples connected by a transmission event from i to j, transmission of Mtb is expected to have occurred on the terminal branch above j. Even though we do not know exactly when j went from latent TB to active TB, the latent period is included in the length of the terminal branch leading to j (see main text). We therefore use this branch length as an upwardly biased estimate for latency time. (B) For transmission pairs in the calculated transmission networks, the length (in years) of terminal branches leading to the recipient of the pairs (overestimated latency period) was extracted and stratified by HIV status of the recipient. To account for incomplete sampling, the analyses were performed on all 251 calculated transmission events as well as subsets including only the most likely transmission pairs (top 45, 35 and 25%). ***denotes p<0.001, *denotes p<0.05 as determined by unpaired t-test.
When including only the top 25% of transmission events, the average branch lengths were 5.56 versus 4.65 years for HIV negative and positive receivers, respectively. A comprehensive review of 52 studies found that the average time from TB symptom onset to diagnosis (diagnostic delay) is approximately two months, with no significant difference between high and low income countries (Sreeramareddy et al., 2009). Another meta-study found that HIV positive status was associated with both increased and decreased diagnostic delay, depending on study setting (Storla and Yimer, 2008). The study most relevant to the current setting was conducted in 2005 in Buenos Aires and other Argentinean provinces and found a delay of about three months, with no significant effect of HIV status (Zerbini et al., 2008). As the difference in TB activation time we infer between HIV- and HIV+ is several times higher than the diagnostic delay reported in any setting, we feel confident that it reflects faster progression to active TB in HIV+ patients.
The fact that HIV co-infection significantly increases the rate of reactivation of latent TB is well documented. A comprehensive study from the United States found the rate of reactivation to be 25-fold higher in HIV co-infected individuals relative to their HIV-free peers (1.82 vs 0.072 per 100 person-year) (Shea et al., 2014). However, our outbreak analysis is necessarily restricted to people who develop active TB, and in this subset of cases, HIV co-infection seems to be associated with a relatively modest acceleration of TB progression, speeding up the process by about 11 months. We do not know when individual patients contracted TB and HIV respectively. Hypothetically, the accelerating effect of HIV co-infection on TB progression is likely to be underestimated in patients who were infected with HIV significantly later than TB. Conversely, patients who were infected with TB late in the study period might be enriched for HIV co-infection as these patients were more likely to develop TB in time for inclusion in the study. However, as the study period was relatively long, we do not believe this potential bias to have significantly affected our results.
Discussion
The single most important impact of HIV infection in this large multi-decade outbreak of MDR-TB seems to be an increase in the proportion of patients who develop active TB. The HIV prevalence in Argentina is approximately 0.4% (in 2001 and 2014) (World Health Organization, 2013), whereas the proportion of HIV co-infected individuals is 60.7% within the M outbreak. These numbers demonstrate that HIV infection is a massive risk factor for developing TB with the MDR M strain. We found that HIV co-infection is associated with a moderately faster, yet statistically significant, progression to active TB. As this subtle effect of HIV status on time to active TB cannot explain the far higher incidence of the M strain in HIV positives, this suggests that the main effect of HIV co-infection is to increase the absolute risk of developing active TB. In other words, we surmise that a large proportion of HIV negatives infected by the M strain will not progress to active TB but for those that do, the latency period is only slightly longer than for HIV positives.
This study encompasses an outbreak within which resistance to six common anti-TB drugs evolved early on, and our results are thus mainly restricted to the evolution of resistance to second-line drugs such as ETH and FLQ in individual isolates. Extrapolation of these findings to evolution of resistance to first-line drugs thus requires caution. However, the physiological and societal impact of HIV on TB patients as well as the fitness constraints associated with new Mtb resistance mutations should be fundamentally similar regardless of drug class.
It should be noted that free access to highly active antiretroviral therapy in Argentina from 1997 is likely to have mitigated the accelerating effect of HIV on TB progression (Waisman, 2001; Gupta et al., 2015). Clinical data on HIV progression was not available and we were thus unable to quantify the effect of anti-retroviral treatment by stratifying our analyses by CD4 counts. We predict CD4 counts would correlate negatively with TB progression. However, we do not expect that antiviral therapy (and increased CD4 counts) would nullify the effect of HIV on TB progression. Indeed, it has been shown that TB incidence during highly active anti-retroviral treatment is significantly higher than background levels even though a number of possible confounders makes the exact quantification of the effect of antiretroviral therapy challenging (Gupta et al., 2015; Girardi et al., 2005; Lawn et al., 2005; Lawn and Wood, 2005).
Accurate reconstruction of transmission chains of bacteria with extended periods of within-host evolution remains challenging (see Materials and methods). But even though some artefactual transmissions are bound to be included among the reconstructed high-confidence events, we are confident that the overall pattern of transmission is shaped by actual events and is hence robust. Restricting the transmission network analyses to the most likely transmission events did not affect our finding that HIV status does not significantly impact the transmissibility of Mtb (Table 3 and Table 4). We also found that HIV co-infection does not affect the rate of Mtb evolution within patients. In fact, Mtb was found to accumulate more mutations in HIV negatives. This likely reflects the slower progression to active disease in this group, with these patients harboring Mtb for a more extended period relative to HIV positives. This pattern holds true also for antimicrobial resistance mutations, which were found to evolve significantly more often in HIV negatives than in HIV positives.
We previously showed that the largest clade in the M outbreak had evolved resistance to six antimicrobials by 1979, well before the HIV epidemic reached Argentina (Eldholm et al., 2015), a finding which has been replicated for another highly resistant Mtb lineage in South Africa (Cohen et al., 2015). To put our results in a global context, we retrieved data on the burden of TB, MDR-TB and HIV globally from the World Health Organization (WHO) Global Health Observatory Data Repository (http://apps.who.int/gho/data/node.main). We observed a strong correlation between TB and MDR-TB prevalence (Figure 5a) as well as a correlation between HIV and TB burden between countries (Figure 5b). We also recovered a highly significant correlation between HIV and MDR-TB (Figure 5c). However, when correcting the MDR-TB burden for total TB burden, the correlation vanished (Figure 5d). This is in line with our results on the M outbreak that HIV is a driver of TB in general, but does not disproportionately contribute to the rise of MDR-TB lineages.

Correlations between global patterns of HIV, TB and MDR-TB prevalence.
Clockwise: Per country prevalence of MDR-TB as a function of TB prevalence (p=2.2 × 10−16); TB prevalence as a function of HIV prevalence (p=5.9 × 10−6); MDR-TB prevalence as a function of HIV prevalence (p=1.6 × 10−4); Proportion of MDR-TB cases among TB patients as a function of HIV prevalence (p=0.8). All values are log-transformed. The depth of shading of individual dots reflect the TB prevalence in individual countries.
-
Figure 5—source data 1
Global per-country health, economy and disease metrics.
- https://doi.org/10.7554/eLife.16644.018
By combining Bayesian evolutionary analyses and the reconstruction of transmission networks based on a new epidemiological model, we were able to directly assess the impact of HIV on the evolution and transmission of the single most widespread MDR-TB strain reported to date in South America. The main pre-extensively resistant (pre-XDR) clade within the outbreak evolved before the HIV epidemic in Argentina, but HIV patients at a major hospital in Buenos Aires played a central role in fueling the epidemic in the early 1990s (Ritacco et al., 1997; Eldholm et al., 2015), by providing the strain with a large and spatially restricted reservoir of individuals susceptible to develop active TB. Once the outbreak erupted, we find that HIV co-infection did not play a role in accelerating Mtb mutation rates; neither did HIV co-infected patients cause secondary TB cases at significantly higher rates than their HIV negative peers did. Our findings confirm that HIV co-infected patients have increased susceptibility to contract TB, but strongly suggest that they do not drive the evolution of Mtb resistance within an outbreak, nor do they act as super-spreaders of MDR-TB.
Materials and methods
Isolate collection
Request a detailed protocolAll available isolates belonging to the M outbreak as assessed by IS6110 RFLP were included in the study (see (Eldholm et al., 2015) for additional information on samples). The exact number of lost isolates is not known. No IS6110 RFLP data are available for isolates from before 1992; a freezer accident also contributed significantly to sample loss.
Genomic analyses and phylogenetic evolutionary inferences
Request a detailed protocolThe protocols used for DNA isolation, preparation of sequencing libraries and SNP calling are described in (Eldholm et al., 2015), as are the methods for phylogenetic evolutionary inferences, testing of tip-based calibrations and molecular dating. Sequence reads from the study can be found under European Nucleotide Archive accession PRJEB7669. Briefly, BEAST 1.7.4 (Drummond and Rambaut, 2007) was used to infer a phylogeny, branch lengths and evolutionary rates using a general time reversible substitution model with variation among sites simulated using a discrete gamma distribution with four rate categories. We assumed a lognormal relaxed clock to allow variation in rates among branches in the trees. Trees were calibrated using tip dates only with sample time span ranging from October 1996 to December 2009. Following appropriate testing, we applied an exponential demographic model. Posterior distributions of parameters, including branch lengths and substitution rates were estimated by Markov chain Monte Carlo (MCMC) sampling.
Analyzing differences in number of accumulated mutations between Mtb strains evolving in HIV-positive and negative patients
Request a detailed protocolIn this study, we aimed to test for evolutionary differences between strains evolving in HIV positive and negative patients. Because we can only be confident about the HIV status from which the samples were collected from, we restricted these analyses to terminal branches in the tree. We estimated the rates of evolution on terminal branches and compared those leading to HIV- and HIV+ hosts using two sample unpaired t-tests. We used the baseml model implemented in PAML program to perform the empirical Bayesian reconstruction of ancestral sequences. High-likelihood resistance mutations in the genes embB, ethA, gidB, gyrA, gyrB, katG, ndh, mshA, pncA, rpoB, rpsL and rrs were identified as described previously (Eldholm et al., 2015).
Reconstruction of transmission chains and assessment of the impact of HIV co-infection
Request a detailed protocolThe code used to reconstruct transmission events is available at https://github.com/xavierdidelot/TransPairs. We wanted to reconstruct likely transmission events between sampled individuals, with the added difficulty that we knew a significant proportion of infected individuals were not sampled, so that some of the sampled individuals would have been infected by unsampled individuals. To avoid this difficulty, we developed the following inferential framework in which the likelihood of direct transmission from any sampled host to any other can be calculated. We consider a Susceptible-Exposed-Infectious-Removed (SEIR) model where individuals move from E to I at rate γ1 and from I to R at rate γ2. We also assume that within-host coalescence happens at a constant rate α as in previous work (Didelot et al., 2014). We want to calculate the likelihood Li→j of transmission from host i to host j (Figure 4). Let ti and tjdenote the known times at which the two hosts are sampled. Let ti,j denote the time at which the samples from i and j last shared a common ancestor, which is known from the timed phylogeny (Figure 1). Let s denote the unknown time at which i transmitted to j, assuming that this is indeed what happened. s is unknown but is greater than ti,j and smaller than both ti and tj. With these notations:
The first term in the integral is the probability of host i being removed at time ti given that he was infectious at time s and is exponentially distributed with rate γ2:
The second term in the integral is the probability of host j being removed at time tj given that he was exposed at time s and so is a convolution of the exponentials with rates γ1 and γ2:
The third term in the integral is the probability that coalescence of the two lines present in host i at time s happens at time ti,j and also that either i was infectious at time ti,j and stayed so until s or that host i was latent at time ti,j and became infectious (but not removed) by time s, leading to:
By injecting the last three equations into the first we get the likelihood of transmission from i to j. These calculations were made for all putative infector-infectee pairs using γ1 = 0.2 per year and γ2 = 3 per year and the previously estimated within-host coalescent rate α = 0.83 per year (Didelot et al., 2014). The likelihoods of transmission from smear negative individuals was multiplied by 0.05 to reflect the lower infectiousness of these individuals. The SEIR epidemiological model assumed in the calculations above implies that there is random mixing between the individuals, with every infectious individual being a priori equally likely to infect any susceptible individual. Although the assumption of random mixing is appealing in theory, in practice human population are well known to behave differently, with for example a strong effect of the household structure in the transmission patterns of many pathogens (Cauchemez et al., 2004; Whittles and Didelot, 2016). Here we did not have information on the structure of the population and so could not integrate it in our model. Application of our methodology in a setting where such information is available could be performed simply by multiplying the likelihood values with the a priori probabilities of transmission caused by the host population structure.
From the full matrix of transmission likelihoods between all pair of strains, we aimed to reconstruct disease transmission as accurately as possible. For each pair [i,j] of the transmission matrix, we started by removing the lowest likelihood value (i infecting j or j infecting i). From the remaining transmission events, we used Edmonds algorithm implemented in the RBGL R package (Carey et al., 2016) to find the spanning arborescence of minimum weight (sometimes called an optimum branching). An optimum branching is a graph defined as a set of directed edges that contain no cycles and such that no two edges are directed towards the same node. In our reconstruction, such a graph contains n nodes, n being the number of isolates and n-1 directed edges representing the transmission events. As our sampling of the outbreak was not exhaustive, we know that a proportion of direct transmission events did not happen. To deal with that situation, we used various thresholds of inferred transmission events with the highest associated likelihoods to plot the transmission graphs and analyze the distribution of transmission events.
Simulations and sensitivity analyses
Request a detailed protocolIn order to test the accuracy of the above method of reconstruction of transmission chains, we simulated an SEIR model for a population of N = 3000 individuals, with a transmission rate of β = 0.001 per year, a rate of becoming infectious when exposed γ1 = 0.2 per year, and a rate of being removed when infectious γ2 = 3 per year. These values of N and β were selected to produce simulated outbreaks of roughly the same size as in the real data, and these values are not used for inference. The transmission tree generated by this simulation was recorded. A timed phylogeny was then constructed from the transmission tree, using a coalescent within-host evolutionary model with coalescent rate α = 0.83 per year, and leaves were randomly removed from this tree to simulate incomplete sampling of cases, keeping two thirds of the leaves in the second half of the outbreak to emulate the sampling frame in our study. The resulting phylogeny (Figure 6) was then analyzed in exactly the same way as the real data: the likelihood of transmission was computed for every pair of leaves, a transmission tree was deduced using Edmonds algorithm, and only the 25%, 35% or 45% most likely transmission links were retained to account for incomplete sampling. When applying these three thresholds to the simulated data, we found that the proportion of correctly inferred links were 74%, 69% and 63%, respectively. These results conform with our expectation given that there is significant uncertainty about who infected whom based on genomic data alone when accounting for extended periods of within-host evolution (Didelot et al., 2016; Biek et al., 2015; Didelot et al., 2014; Worby et al., 2014).

Timed phylogeny used in simulation of SEIR model.
https://doi.org/10.7554/eLife.16644.020The results of our transmission analysis are based on three parameter values, namely a mean latent period of 5 years, a mean infectious period of 120 days and a smear correction of 0.05 by which the likelihood of transmission from smear negative individuals is multiplied. These parameters were selected based on the literature and clinical experience. The latent period can vary extensively between people. Approximately two months of diagnostic delay (Sreeramareddy et al., 2009) plus two months from treatment onset to clearance of the MDR infection (Brust et al., 2013) suggests that 120 days is a reasonable estimate of infectious period. Finally, A 20-fold decreased transmissibility of smear-negative cases was chosen as a reasonable parameter (Ma et al., 2015). We performed a sensitivity analysis to test how reliable our results would be if any of these parameters were inaccurate. For each of the three parameters, we ran the analysis again considering double and half of their specified values above, and compared the reconstructed transmission links with those of the main analysis. In each case, the proportion of links identical with the main analysis was between 91% and 99%. We also performed an analysis in which no smear correction was applied and recovered 90.5% of the links in the main analysis.
Collection of data on global HIV, TB and MDR-TB burden
Request a detailed protocolHIV prevalence expressed as% population between the age of of 15 and 49 was downloaded form the World Bank Data website (http://data.worldbank.org/indicator/SH.DYN.AIDS.ZS).
TB and MDR-TB prevalence data was obtained from the World Health Organization (http://www.who.int/tb/country/data/download/en/). For TB prevalence, data was available for all countries for the year 2013 and point estimates of prevalence by 100 k individuals were retrieved (e_prev_100 k).
For MDR-TB prevalence, the data was collected less systematically, and relies on a mix of surveillance, surveys and models. We used the estimated number of MDR-TB cases among all notified pulmonary TB cases (e_mdr_num), expressed as prevalence per 100 k individuals by dividing by country population size estimates from the same source. We calculated the proportion of MDR-TB cases by dividing the prevalence of MDR-TB by the prevalence of TB. All four variables (HIV-, TB-, MDR-TB- prevalence and the ratio of MDR-TB/TB prevalence were transformed as log(x+1) prior to analyses. Pearson correlation coefficients were used to test for significant associations between the prevalence of TB and MDR-TB, HIV and TB, HIV and MDR-TB and finally HIV and of the MDR-TB/TB ratio.
The robustness of the prevalence estimates likely vary between countries due to difference in methodology and surveillance effort, which may lead to some biases in the correlations reported in Figure 5. We reasoned that more robust estimates should be obtained in countries with more developed economies and public health institutions.
Thus, we additionally retrieved estimates for 2013 GDP per capita (http://data.worldbank.org/indicator/NY.GDP.PCAP.CD) and health expenditure (%) http://data.worldbank.org/indicator/SH.XPD.TOTL.ZS.
For all countries. Health expenditure was transformed into absolute health expenditure per capita, by multiplying by GDP and dividing by population size of the countries. The source data used in these analyses is provided in Figure 5—source data 1.
We then recomputed the correlations reported in Figure 5 on different fractions (25%, 50% and 75%) of the countries with highest GDP or health expenditure per capita. Prevalence estimates from countries with lower GDP are indeed likely to be less robust as the coefficients between the significant correlations in Figure 5 (panels A, B and C) were substantially higher for the countries with high GDP. However, importantly, we never recovered a significant correlation between the prevalence of HIV and the proportion of TB that were MDR-TB. In Figure 5—figure supplement 1, we report the correlations between the same variables than in Figure 5 for the 50% countries with highest GDP.
Data availability
-
Whole genome analysis of time-structured samples from an outbreak of Mycobacterium tuberculosis M strain in Buenos AiresPublicly available at the EBI European Nucleotide Archive (accession no: PRJEB7669).
References
-
Measurably evolving pathogens in the genomic eraTrends in Ecology & Evolution 30:306–313.https://doi.org/10.1016/j.tree.2015.03.009
-
Acquired rifamycin resistance with twice-weekly treatment of HIV-related tuberculosisAmerican Journal of Respiratory and Critical Care Medicine 173:350–356.https://doi.org/10.1164/rccm.200503-417OC
-
Therapeutic implications of drug interactions in the treatment of human immunodeficiency virus-related tuberculosisClinical Infectious Diseases 28:419–429.https://doi.org/10.1086/515174
-
SoftwareRBGL: An interface to the BOOST graph library, version R package version 1.44.0Bioconductor.
-
A bayesian MCMC approach to study transmission of influenza: application to household longitudinal dataStatistics in Medicine 23:3469–3487.https://doi.org/10.1002/sim.1912
-
Prevention and treatment of tuberculosis among patients infected with human immunodeficiency virus: principles of therapy and revised recommendationsMorbidity and Mortality Weekly Report 47:1–58.
-
Bayesian inference of infectious disease transmission from whole-genome sequence dataMolecular Biology and Evolution 31:1869–1879.https://doi.org/10.1093/molbev/msu121
-
Within-host evolution of bacterial pathogensNature Reviews Microbiology 14:150–162.https://doi.org/10.1038/nrmicro.2015.13
-
BEAST: Bayesian evolutionary analysis by sampling treesBMC Evolutionary Biology 7:214.https://doi.org/10.1186/1471-2148-7-214
-
Mycobacterium tuberculosis transmission in a country with low tuberculosis incidence: role of immigration and HIV infectionJournal of Clinical Microbiology 50:388–395.https://doi.org/10.1128/JCM.05392-11
-
Incidence of tuberculosis among HIV-infected patients receiving highly active antiretroviral therapy in Europe and North AmericaClinical Infectious Diseases 41:1772–1782.https://doi.org/10.1086/498315
-
HIV and tuberculosis: a deadly human syndemicClinical Microbiology Reviews 24:351–376.https://doi.org/10.1128/CMR.00042-10
-
Incidence of tuberculosis during highly active antiretroviral therapy in high-income and low-income countriesClinical Infectious Diseases 41:1783–1786.https://doi.org/10.1086/498308
-
Transmissibility of tuberculosis among school contacts: An outbreak investigation in a boarding middle school, ChinaInfection, Genetics and Evolution 32:148–155.https://doi.org/10.1016/j.meegid.2015.03.001
-
Mutation rate and the emergence of drug resistance in mycobacterium tuberculosisJournal of Antimicrobial Chemotherapy 69:292–302.https://doi.org/10.1093/jac/dkt364
-
Drug malabsorption and resistant tuberculosis in HIV-infected patientsNew England Journal of Medicine 332:336–337.https://doi.org/10.1056/NEJM199502023320518
-
Malabsorption of antimycobacterial medicationsNew England Journal of Medicine 329:1122–1123.https://doi.org/10.1056/NEJM199310073291513
-
Antituberculosis drug resistance acquired during treatment: An analysis of cases reported in California, 1994-2006Clinical Infectious Diseases 56:761–769.https://doi.org/10.1093/cid/cis989
-
Evolutionary analysis of the dynamics of viral infectious diseaseNature Reviews Genetics 10:540–550.https://doi.org/10.1038/nrg2583
-
Transmission of multiple drug-resistant tuberculosis: report of a school and community outbreakAmerican Journal of Epidemiology 113:423–435.
-
Nosocomial spread of human immunodeficiency virus-related multidrug-resistant tuberculosis in Buenos AiresJournal of Infectious Diseases 176:637–642.https://doi.org/10.1086/514084
-
Multidrug-resistant tuberculosis: eight years of surveillance in FranceEuropean Respiratory Journal 22:833–837.https://doi.org/10.1183/09031936.03.00014103
-
Modeling the dynamic relationship between HIV and the risk of drug-resistant tuberculosisScience Translational Medicine 4:135ra67.https://doi.org/10.1126/scitranslmed.3003815
-
Estimated rate of reactivation of latent tuberculosis infection in the United States, overall and by population subgroupAmerican Journal of Epidemiology 179:216–225.https://doi.org/10.1093/aje/kwt246
-
Exogenous reinfection with multidrug-resistant mycobacterium tuberculosis in patients with advanced HIV infectionNew England Journal of Medicine 328:1137–1144.https://doi.org/10.1056/NEJM199304223281601
-
Improved prognosis in HIV/AIDS related multi-drug resistant tuberculosis patients treated with highly active antiretroviral therapyMedicina 61:810–814.
-
HIV Infection and multidrug‐resistant tuberculosis—The perfect stormJournal of Infectious Diseases 196:S86–S107.https://doi.org/10.1086/518665
-
Epidemiological analysis of the eyam plague outbreak of 1665–1666Proceedings of the Royal Society 283:20160618.https://doi.org/10.1098/rspb.2016.0618
-
Global report: UNAIDS report on the global AIDS epidemic
-
Global Tuberculosis Report
-
PAML 4: Phylogenetic analysis by maximum likelihoodMolecular Biology and Evolution 24:1586–1591.https://doi.org/10.1093/molbev/msm088
-
Delay in tuberculosis diagnosis and treatment in four provinces of ArgentinaThe International Journal of Tuberculosis and Lung Disease 12:63–68.
Decision letter
-
Mark JitReviewing Editor; London School of Hygiene & Tropical Medicine, and Public Health England, United Kingdom
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "Impact of HIV co-infection on the evolution and transmission of multidrug-resistant tuberculosis" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Prabhat Jha as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Summary:
The reviewers and Reviewing Editor were satisfied with the general direction of the analysis and found the work to be sophisticated and to hold an important public health message. However, there were a number of issues around the model and its parameterisation that we felt were inadequate, and which would need to be addressed for the manuscript to be considered further.
Essential revisions:
1) In the first paragraph of the subsection “Effect of HIV co-infection on TB transmission”, you appear to describe a new method for inferring transmission pairs (i.e. person A infected person B), which you have developed. You use a simple SEIR transmission model in determining who infected whom and use an SEIR simulation to help validate the method.
It is important that you further explain this method so that readers can better understand it and researchers will be better able to apply it judiciously in their own work. For example, you conducted a sensitivity analysis on the values used for the latent period and the removal rate but you did not include HIV status in the model, and HIV status could affect each of these parameters. Furthermore, you do not state what mixing assumption you used and likely assumed random mixing. It would be useful if you could explore how the method would perform in the context of non-random mixing (e.g. discuss what the expected effects would be and not necessarily conduct new simulations). You also reported that keeping two-thirds of cases in the second half of the simulated outbreak corresponded to the sampling frame used for the actual genomic data analyses but you need to explain where this number came from (subsection “Simulations and Sensitivity analyses”). If necessary, you should also test the impact of different sampling fractions.
You state "These results conform with our expectation given that there is significant uncertainty about who infected whom based on genomic data alone when accounting for extended periods of within-host evolution" but this high degree of uncertainty does not appear in the conclusions in the main text.
2) In the third paragraph of the subsection “Effect of HIV co-infection on TB transmission”, you tested whether HIV-positive or HIV-negative individuals transmit TB more often conditional on having TB disease. As you suggest, the analysis including all transmissions seems like it would not be very informative since it likely includes many pairs who did not actually transmit between them. While the top X% analyses are affected by the same issue to a lesser extent, there could be another bias in these analyses since enriching for actual transmission pairs would seem to condition on there being at least 1 transmission from the putative source case and thereby eliminates (many of) the instances in which no TB transmission occurred. This would skew the distribution of onward transmissions which could make HIV-positives and negatives appear more similar in terms of their transmission potential than is actually the case.
To provide a simple numerical example of the potential issue, suppose that out of 100 HIV-negative people with active TB 60 of them transmit to 1 other person, while 80 out of 100 TB-HIV coinfected people each transmit TB to 1 person. In this scenario, TB would appear to have the same transmissibility among those who transmitted (i.e. 1 TB transmission per TB transmitter in each HIV group). However, this would give a misleading picture of the actual TB transmission potential of HIV-positive vs. HIV-negative individuals since coinfected people had an 80% probability of transmitting whereas HIV-negative people had a 60% chance of transmitting.
3) In the last paragraph of the subsection “Effect of HIV on progression of Mtb infection to active TB”, you correlate cross-sectional data on HIV and TB prevalences and HIV and MDR-TB prevalences across countries in an effort to determine whether HIV increases TB in general or if it specifically drives the spread of MDR-TB. However, you should explain why this analysis is appropriate in light of the findings of Sergeev et al. (Sci Trans Med 2013, reference 20) who find that the "individual-level association between HIV and drug-resistant forms of TB is dynamic, and therefore, cross-sectional studies that do not report a positive individual-level association will not provide assurance that HIV does not exacerbate the burden of resistant TB in the community."
Also, while the analysis includes HIV and TB/MDR-TB, it's not clear on how it specifically fits in with the rest of paper or if it could be cut without loss of any key information.
You report in Figure 5, the correlation between global patterns of MDR-TB, TB and HIV prevalence. In the Methods you refer to the World Health Organization for the underlying data. This reference should be extended as it now only states "World Health Organization". Data from several countries are known to have limitations in the sense that the prevalence of HIV or TB may not always been known or collected in a rigorous manner. The prevalence of a particular disease can also vary within a single country. Use of these data can therefore also result in biased estimates, as estimates for HIV and TB may be derived from different sites in a single country.
This entire analysis needs to be either strengthened or simply removed from the paper.
4) More explanation is required for key numbers reported in their paper.
In particular, in the Results, you state: "Based on the available data we considered that the sequenced outbreak isolates represented about 35% of the total number of individuals belonging to the outbreak." You should explain how you came up with the figure of 35%. What are you assuming about the fraction of TB cases that are diagnosed and recorded? How, and over what time scale, are you defining the "outbreak" since reactivation TB cases may have been infected decades prior to disease onset?
Again, in the Results, you state: "This left 13 resistance mutations that evolved with high probability during therapy in 11 patients (Table 2). Seven of the patients were HIV negative and four were positive. A χ2 analysis revealed a statistically significant overrepresentation of acquired resistance in HIV-negative cases relative to positives (p = 0.027). This finding strongly suggests that HIV was not a driver of Mtb drug resistance within the outbreak." Given that you indicate that these results strongly back up a key part of your findings, it is important that more information be provided. Specifically, it is unclear exactly which numbers were being compared in the test (e.g. was the null hypothesis that there should be a 50:50 split between HIV-positives and negatives, or was it directly comparing the fraction of mutations found in HIV-infected vs. uninfected individuals [though this would not seem to give the p-value reported]). Also, a Fisher's exact test, not a chi-squared test, would likely be required given that apparently some of the cell sizes were very small.
https://doi.org/10.7554/eLife.16644.026Author response
Essential revisions:
1) In the first paragraph of the subsection “Effect of HIV co-infection on TB transmission”, you appear to describe a new method for inferring transmission pairs (i.e. person A infected person B), which you have developed. You use a simple SEIR transmission model in determining who infected whom and use an SEIR simulation to help validate the method.
It is important that you further explain this method so that readers can better understand it and researchers will be better able to apply it judiciously in their own work. For example, you conducted a sensitivity analysis on the values used for the latent period and the removal rate but you did not include HIV status in the model, and HIV status could affect each of these parameters. Furthermore, you do not state what mixing assumption you used and likely assumed random mixing. It would be useful if you could explore how the method would perform in the context of non-random mixing (e.g. discuss what the expected effects would be and not necessarily conduct new simulations). You also reported that keeping two-thirds of cases in the second half of the simulated outbreak corresponded to the sampling frame used for the actual genomic data analyses but you need to explain where this number came from (subsection “Simulations and Sensitivity analyses”). If necessary, you should also test the impact of different sampling fractions.
We have expanded our description of the methodology used to estimate transmission probabilities, both in the Results and Methods section. For a given putative infector-infectee pair, the calculation depends only on the dates of detection of the two cases, their relative position on the phylogeny, and whether the infector was smear positive or negative. It does not rely on HIV status, so that the effect of HIV status on transmission can be tested independently. We have clarified this important point in the Results section (subsection “Effect of HIV co-infection on TB transmission”, first paragraph). The underlying epidemiological model is a SEIR model, which assumes random mixing of the population, and we have clarified this in the Methods section (subsection “Reconstruction of transmission chains and assessment of the impact of HIV co-infection”, first paragraph) and now discuss how information on the host population structure could be integrated into our inferential framework. The code used to reconstruct transmission events is now available on GitHub https://github.com/xavierdidelot/TransPairs and the paper now includes a pointer to this website (in the aforementioned paragraph, the code should be available now or within a few days). The two-thirds sampling in the latter half of the outbreak reflects the fact that we know the completeness of sampling to be 63% (143/228) in the years 2003 – 2009 (2009 being the last year covered by the study). We have clarified the estimated sample loss in the early and late stages of sampling in the last paragraph of the subsection “Impact of HIV co-infection on Mtb mutation rates and resistance development”.
You state "These results conform with our expectation given that there is significant uncertainty about who infected whom based on genomic data alone when accounting for extended periods of within-host evolution" but this high degree of uncertainty does not appear in the conclusions in the main text.
We agree that this uncertainty could have been acknowledged more clearly in our conclusions. We have now added some reservations regarding the accuracy of the reconstruction of individual transmission events in the Discussion section, fifth paragraph.
2) In the third paragraph of the subsection “Effect of HIV co-infection on TB transmission”, you tested whether HIV-positive or HIV-negative individuals transmit TB more often conditional on having TB disease. As you suggest, the analysis including all transmissions seems like it would not be very informative since it likely includes many pairs who did not actually transmit between them. While the top X% analyses are affected by the same issue to a lesser extent, there could be another bias in these analyses since enriching for actual transmission pairs would seem to condition on there being at least 1 transmission from the putative source case and thereby eliminates (many of) the instances in which no TB transmission occurred. This would skew the distribution of onward transmissions which could make HIV-positives and negatives appear more similar in terms of their transmission potential than is actually the case.
To provide a simple numerical example of the potential issue, suppose that out of 100 HIV-negative people with active TB 60 of them transmit to 1 other person, while 80 out of 100 TB-HIV coinfected people each transmit TB to 1 person. In this scenario, TB would appear to have the same transmissibility among those who transmitted (i.e. 1 TB transmission per TB transmitter in each HIV group). However, this would give a misleading picture of the actual TB transmission potential of HIV-positive vs. HIV-negative individuals since coinfected people had an 80% probability of transmitting whereas HIV-negative people had a 60% chance of transmitting.
We appreciate the reviewers’ thorough assessment of our transmission reconstruction analyses. We agree that the distribution of transmission per potential transmitter is indeed worth a closer examination than we presented in the original submission. That said, we believe that the potential bias identified by the reviewers as illustrated by their numerical example is based on a misunderstanding of how these analyses were actually conducted. Based on the general depth and quality of the reviewers’ comments, this suggests to us we should explain our methodology better, as pointed out by the reviewers under point 1.
In the original submission we did not quantify the number of transmitters, but the number of reconstructed transmission events. We will illustrate this point with the same numerical example: 60% of HIV- and 80% of HIV+ transmitting to one secondary case and assuming we can perfectly reconstruct all transmissions. If we had counted the number of transmissions per person who have transmitted, we would indeed find 1 for both HIV+ and HIV-. However, what we counted were the number of transmissions and for the same numerical example we would thus have observed 4/3 more transmissions from HIV+ relative to HIV-.
This being said, we recognize that skewed distributions of transmissions, e.g. in the form of a limited number of transmitters responsible for a large portion of transmissions within one of the groups could indeed mask differences in transmissibility. We therefore extended our analyses to investigate the distribution of transmissions in more detail. For each subset of transmissions (all and top-X%) we counted the number of transmissions per transmitter (0, 1, 2, … n) and used non-parametric Kruskal-Wallis tests to assess whether the distributions were significantly different. When all transmission events were included, HIV-positive transmitters tended towards having transmitted more often than HIV-negatives, but the difference was not significant. After application of the various top-X% cut-offs the two groups became even more similar (Table 4). This suggests that even when taking into account the distribution of transmissions per transmitter, there is no major impact of HIV co-infection on transmissibility.
The extended results are included in Table 4 and discussed in the fourth paragraph of the subsection “Effect of HIV co-infection on TB transmission” in the revised manuscript.
3) In the last paragraph of the subsection “Effect of HIV on progression of Mtb infection to active TB”, you correlate cross-sectional data on HIV and TB prevalences and HIV and MDR-TB prevalences across countries in an effort to determine whether HIV increases TB in general or if it specifically drives the spread of MDR-TB. However, you should explain why this analysis is appropriate in light of the findings of Sergeev et al. (Sci Trans Med 2013, reference 20) who find that the "individual-level association between HIV and drug-resistant forms of TB is dynamic, and therefore, cross-sectional studies that do not report a positive individual-level association will not provide assurance that HIV does not exacerbate the burden of resistant TB in the community."
Also, while the analysis includes HIV and TB/MDR-TB, it's not clear on how it specifically fits in with the rest of paper or if it could be cut without loss of any key information.
You report in Figure 5, the correlation between global patterns of MDR-TB, TB and HIV prevalence. In the Methods you refer to the World Health Organization for the underlying data. This reference should be extended as it now only states "World Health Organization". Data from several countries are known to have limitations in the sense that the prevalence of HIV or TB may not always been known or collected in a rigorous manner. The prevalence of a particular disease can also vary within a single country. Use of these data can therefore also result in biased estimates, as estimates for HIV and TB may be derived from different sites in a single country.
This entire analysis needs to be either strengthened or simply removed from the paper.
We appreciate our conclusion may seem in variance with the sentence copied from the Abstract of the paper by Sergeev et al. However, we feel it is important to put their statement in its context. The Sergeev et al. paper deals with the dynamics of drug resistant TB (DRTB) over the course of the HIV pandemic in a single setting (Zimbabwe). They focus on a quantity defined as the “ratio of the proportion of TB cases that are drug resistant among HIV-seropositive individuals to the proportion of TB cases that are drug resistant among HIV-seronegative individuals”.
The paper highlights the interesting behavior over time of this ratio, but this is not a quantity we consider. The paper does not explore the effect of different rates of incidence/prevalence of HIV and does not make predictions for other settings than the situation in Zimbabwe they model. The sentence from their Abstract makes an interesting point about the behavior of this ratio in their simulations and states that depending if it is recorded early or late in the same epidemic, it can take different values. However, we do not think the statement can be expanded to warn against cross-comparisons devoid of individual HIV/TB status over multiple settings all recorded at the same time, late in the HIV pandemic (in our case 2013).
The only figure from the Sergeev paper that can be directly compared to the quantities we consider in our correlations is their Figure 3B, which reports the simulated incidence of TB and DRTB in the presence or absence of HIV. Interestingly the simulated patterns they report are actually in line with our global correlations. Except in the very early stages of the epidemic (before ~1990), populations experiencing an HIV epidemic also suffer a higher burden of TB (our panel 5B), a higher burden of MDR-TB (our panel 5C), but the ratio of MDR-TB/ TB remains constant (our panel 5D).
The Sergeev paper also makes a couple of seemingly reasonable assumptions that are likely important determinants of the modelled individual-level rates of drug resistant TB by HIV status as well as overall resistant/susceptible ratio over time. Their model assumes a lower fitness threshold in HIV+ individuals, which necessarily leads to resistant strains (that have lower fitness) being more likely to cause disease in HIV positives (although the model includes compensatory mechanisms that reduce the fitness cost over time in resistant bacteria).
Less important for individual-level relationships but probably an important shaper of the modelled overall rates of drug resistant TB is the parametrization of transmissibility in the model, which assumes that HIV+ TB patients transmit TB less efficiently. It is difficult for us to assess this in detail, but the combination of these two assumptions (1. HIV+ patients are more prone to become infected and contagious with drug resistant TB. 2. HIV+ transmit TB less efficiently than HIV-), that are expected to have opposing effects on the modelled association between HIV prevalence and DR-TB/TB ratio, might explain why the overall MDR/DR/DS ratio is constant irrespective of the presence of HIV (Figure 3B in the Sergeev paper).
However, as our goal was to assess the effect of HIV on transmission, resistance and latency using real data, we did not parametrize HIV+ and HIV- people separately, as this would defeat the point of our analyses. There is however (surprisingly) little support in our results for different fitness thresholds depending on HIV status (both resistance load and estimated Mtb mutation rates are identical between HIV classes) and also little support for a negative association between HIV co-infection and Mtb transmissibility.
The strength of WHO epidemiological data is their availability for most of the world. Even if country-wide summaries can mask some heterogeneities at a smaller scale, we believe the high level of replication (large number of countries) allows capturing robust qualitative global patterns. However, we recognize that there are differences in methodologies between countries and that data from low income countries are likely to be less reliable (which could lead to biases).
To ensure the robustness of the correlations we report, we have now tested whether GDP per capita or public health expenditure per capita might color our results. Limiting the analyses to countries with the highest GDP or health expenditure did indeed increase the correlations between HIV prevalence, TB prevalence and MDR_TB prevalence but did never lead to a significant correlation between HIV prevalence and the ratio of MDR-TB over susceptible TB prevalence (new Figure 5 and new Figure 5—figure supplement 1).
We appreciate these correlations are not directly linked to the rest of the analyses in the paper. In order to confirm our findings in multiple settings, it will be important to replicate our analyses to other outbreaks or epidemics. However, we feel that the inclusion of Figure 5 offers some support for our findings being general throughout different settings. That said, we recognize that the best place for opening up the debate on the generality of our results is probably not the Results section, and we have now moved the figure and the associated text to the Discussion.
Finally, we agree that the rationale, justification and methodological details were inappropriately detailed in the previous draft and wish to apologize for this oversight. We are now far more explicit on the origin of the data and the analysis and have attached the source data to Figure 5 (Figure 5—source data 1). The changes to the manuscript are listed below”
Inclusion of a fourth panel in Figure 5, showing the correlation between TB prevalence and MDR-TB prevalence;
Figure 5 moved to Discussion and a re-write of the section in the text pertaining to Figure 5 (Discussion, fifth paragraph);
Description of the data and analyses in Methods (subsection “Collection of data on global HIV, TB and MDR-TB burden”);
Inclusion of a supplementary figure with the same correlations than in Figure 5 but limited to the countries with top 50% GDP per capita.
4) More explanation is required for key numbers reported in their paper.
In particular, in the Results, you state: "Based on the available data we considered that the sequenced outbreak isolates represented about 35% of the total number of individuals belonging to the outbreak." You should explain how you came up with the figure of 35%. What are you assuming about the fraction of TB cases that are diagnosed and recorded? How, and over what time scale, are you defining the "outbreak" since reactivation TB cases may have been infected decades prior to disease onset?
We have added an explanation for our estimates of sampling density in the second paragraph of the subsection “Impact of HIV co-infection on Mtb mutation rates and resistance development”. We now refer to one third rather than 35% which is actually more accurate (252 sequenced genomes out of an estimated total of 778). The outbreak start is assumed to be 1992, when the outbreak was detected (Ritacco et al. J infect Dis 1997). We have however shown that the MRCA of the large six-drug resistant clade of the outbreak had evolved already by around 1979, but the strain seems to have circulated in very small numbers prior to 1992, when it caused an outbreak at the Muniz hospital in Buenos Aires (Eldholm et al. 2015 Nat Commun).
Again, in the Results, you state: "This left 13 resistance mutations that evolved with high probability during therapy in 11 patients (Table 2). Seven of the patients were HIV negative and four were positive. A χ2 analysis revealed a statistically significant overrepresentation of acquired resistance in HIV-negative cases relative to positives (p = 0.027). This finding strongly suggests that HIV was not a driver of Mtb drug resistance within the outbreak." Given that you indicate that these results strongly back up a key part of your findings, it is important that more information be provided. Specifically, it is unclear exactly which numbers were being compared in the test (e.g. was the null hypothesis that there should be a 50:50 split between HIV-positives and negatives, or was it directly comparing the fraction of mutations found in HIV-infected vs. uninfected individuals [though this would not seem to give the p-value reported]). Also, a Fisher's exact test, not a chi-squared test, would likely be required given that apparently some of the cell sizes were very small.
We would like to thank the reviewer for identifying this rather weak piece of statistics. The expected numbers are based on the fraction of HIV positives and negatives in the sampled outbreak. We have now redone the statistics using the Fisher’s exact test, which indeed shows that the difference is not significant:
obs | exp | |
---|---|---|
neg | 9 | 5 |
pos | 4 | 8 |
p=0.24 |
We would however like to stress that we were aware that the numbers are too small to draw strong conclusions and that we originally stated (as you acknowledge above) that these results strongly suggested that HIV co-infection was not a driver of Mtb resistance, rather than the much stronger alternative, namely that HIV co-infection could somehow be said to be “protective” against resistance development. We still feel that the results back up a key finding, as the distributions do not deviate from the 0-hypothesis that there is no difference between the two groups, i.e. HIV is not a direct driver of resistance emergence in the study setting. We have updated the subsection “Impact of HIV co-infection on Mtb mutation rates and resistance development” (last paragraph) accordingly and moderated our statement.
https://doi.org/10.7554/eLife.16644.027Article and author information
Author details
Funding
European Research Council (260801-BIG_IDEA)
- Francois Balloux
Norges Forskningsråd (221562)
- Vegard Eldholm
National Institute for Health Research (HPRU-2012-10080)
- Xavier Didelot
National Institute for Health Research (UCL Hospitals Biomedical Research Centre)
- Francois Balloux
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
At the time of submission, the corresponding authors were unable to reach Julia Montana Lopez (co-author) and were as such unable to obtain a final approval from her.
Reviewing Editor
- Mark Jit, London School of Hygiene & Tropical Medicine, and Public Health England, United Kingdom
Publication history
- Received: April 5, 2016
- Accepted: July 18, 2016
- Version of Record published: August 9, 2016 (version 1)
Copyright
© 2016, Eldholm et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
-
- 3,041
- Page views
-
- 560
- Downloads
-
- 36
- Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading
-
- Biochemistry and Chemical Biology
- Microbiology and Infectious Disease
Darunavir (DRV) is exceptional among potent HIV-1 protease inhibitors (PIs) in high drug concentrations that are achived in vivo. Little is known about the de novo resistance pathway for DRV. We selected for resistance to high drug concentrations against ten PIs and their structural precursor DRV. Mutations accumulated through two pathways (anchored by protease mutations I50V or I84V). Small changes in the inhibitor P1'-equivalent position led to preferential use of one pathway over the other. Changes in the inhbitor P2'-equivalent position determined differences in potency that were retained in the resistant viruses and that impacted the selected mutations. Viral variants from the two pathways showed differential selection of compensatory mutations in Gag cleavage sites. These results reveal the high level of selective pressure that is attainable with fifth generation PIs, and how features of the inhibitor affect both the resistance pathway and the residual potency in the face of resistance.
-
- Evolutionary Biology
- Microbiology and Infectious Disease
Increasing numbers of small proteins with diverse physiological roles are being identified and characterized in both prokaryotic and eukaryotic systems, but the origins and evolution of these proteins remain unclear. Recent genomic sequence analyses in several organisms suggest that new functions encoded by small open reading frames (sORFs) may emerge de novo from noncoding sequences. However, experimental data demonstrating if and how randomly generated sORFs can confer beneficial effects to cells are limited. Here we show that by up-regulating hisB expression, de novo small proteins (≤ 50 amino acids in length) selected from random sequence libraries can rescue Escherichia coli cells that lack the conditionally essential SerB enzyme. The recovered small proteins are hydrophobic and confer their rescue effect by binding to the 5' end regulatory region of the his operon mRNA, suggesting that protein binding promotes structural rearrangements of the RNA that allow increased hisB expression. This study adds RNA regulatory elements as another interacting partner for de novo proteins isolated from random sequence libraries, and provides further experimental evidence that small proteins with selective benefits can originate from the expression of nonfunctional sequences.