In-host population dynamics of Mycobacterium tuberculosis complex during active disease

  1. Roger Vargas Jr (corresponding author)
  2. Luca Freschi
  3. Maximillian Marin
  4. L Elaine Epperson
  5. Melissa Smith
  6. Irina Oussenko
  7. David Durbin
  8. Michael Strong
  9. Max Salfinger
  10. Maha Reda Farhat (corresponding author)
  1. Harvard Medical School, United States
  2. National Jewish Health, United States
  3. Icahn School of Medicine at Mount Sinai, United States
  4. University of South Florida, United States

Abstract

Tuberculosis (TB) is a leading cause of death globally. Characterizing the in-host population dynamics of TB's causative agent, Mycobacterium tuberculosis complex (Mtbc), is vital for understanding the efficacy of antibiotic treatment. We use whole-genome sequencing of longitudinally collected clinical Mtbc isolates from the sputa of 200 patients to investigate Mtbc diversity during the course of active TB disease, after excluding 107 cases suspected of reinfection, mixed infection or contamination. Of the 178/200 patients with persistent clonal infection lasting more than 2 months, 27 developed new resistance mutations between sampling, with 20/27 occurring in patients with pre-existing resistance. Low-abundance resistance variants at a purity of ≥19% in the first isolate predict fixation in the subsequent sample. We identify significant in-host variation in 27 genes, including antibiotic resistance genes, metabolic genes and genes known to modulate host innate immunity, and confirm several to be under positive selection by assessing phylogenetic convergence across a genetically diverse sample of 20,352 isolates.

Data availability

All Mtbc sequencing data were collected from previously published studies and are publicly available. Individual accession numbers for the Mtbc genomes analyzed in this study can be found in Supplementary File 2, and information on the studies from which the data were generated can be found in the Methods, Figure 1 - figure supplement 1 and Supplementary File 1. All packages and software used in this study are noted in the Methods. Custom scripts written in Python version 2.7.15 were used to conduct all analyses, interfaced through Jupyter Notebooks. The Jupyter Notebooks and scripts written for data processing and analysis can be found in the following GitHub repository: https://github.com/farhat-lab/in-host-Mtbc-dynamics

Article and author information

Author details

  1. Roger Vargas Jr

    Department of Biomedical Informatics, Harvard Medical School, Boston, United States
    For correspondence
    roger_vargas@g.harvard.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID iD: 0000-0002-7116-5211
  2. Luca Freschi

    Department of Biomedical Informatics, Harvard Medical School, Boston, United States
    Competing interests
    The authors declare that no competing interests exist.
  3. Maximillian Marin

    Department of Biomedical Informatics, Harvard Medical School, Boston, United States
    Competing interests
    The authors declare that no competing interests exist.
  4. L Elaine Epperson

    Center for Genes, Environment, and Health, National Jewish Health, Denver, United States
    Competing interests
    The authors declare that no competing interests exist.
  5. Melissa Smith

    Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, United States
    Competing interests
    The authors declare that no competing interests exist.
  6. Irina Oussenko

    Icahn Institute of Data Sciences and Genomics Technology, Icahn School of Medicine at Mount Sinai, New York, United States
    Competing interests
    The authors declare that no competing interests exist.
  7. David Durbin

    Mycobacteriology Reference Laboratory, Advanced Diagnostic Laboratories, National Jewish Health, Denver, United States
    Competing interests
    The authors declare that no competing interests exist.
  8. Michael Strong

    Center for Genes, Environment, and Health, National Jewish Health, Denver, United States
    Competing interests
    The authors declare that no competing interests exist.
  9. Max Salfinger

    College of Public Health, University of South Florida, Tampa, United States
    Competing interests
    The authors declare that no competing interests exist.
  10. Maha Reda Farhat

    Department of Biomedical Informatics, Harvard Medical School, Boston, United States
    For correspondence
    Maha_Farhat@hms.harvard.edu
    Competing interests
    The authors declare that no competing interests exist.

Funding

National Science Foundation (DGE1745303)

  • Roger Vargas Jr

The authors declare that there was no funding for this work.

Reviewing Editor

  1. Bavesh D Kana, University of the Witwatersrand, South Africa

Version history

  1. Received: August 5, 2020
  2. Accepted: January 25, 2021
  3. Accepted Manuscript published: February 1, 2021 (version 1)
  4. Version of Record published: February 15, 2021 (version 2)

Copyright

© 2021, Vargas et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Roger Vargas Jr
  2. Luca Freschi
  3. Maximillian Marin
  4. L Elaine Epperson
  5. Melissa Smith
  6. Irina Oussenko
  7. David Durbin
  8. Michael Strong
  9. Max Salfinger
  10. Maha Reda Farhat
(2021)
In-host population dynamics of Mycobacterium tuberculosis complex during active disease
eLife 10:e61805.
https://doi.org/10.7554/eLife.61805
