Comparative Genomics: One for all
There is an old saying in computational circles that researchers in bioinformatics would rather use someone else’s toothbrush than use someone else’s code. One example of this adage being true can be seen in previous attempts to compare the rates at which differences in the mechanisms that control DNA accumulate in different species and lineages.
The information contained in DNA is first accessed by dedicated proteins called transcription factors (TFs) that bind to preferred sequences of bases in the DNA. These sequences are typically short, between 8 and 20 bases in length (Vaquerizas et al., 2009), although some can be as long as 35 bases (Filippova et al., 1996). After transcription factor binding has taken place, the basal transcription machinery and its associated complexes open the region’s chromatin and begin transcribing DNA into RNA. These crude transcripts must undergo extensive processing and maturation before they can be exported to the cytoplasm as mature messenger RNA (mRNA). Understanding the rate at which these steps (notably transcription factor binding and the production of mRNA) change during evolution is a long-standing goal in genetics (Wray, 2007; Wittkopp and Kalay, 2012).
Technically, it is (relatively) easy to map all the contacts between the transcription factors and the DNA, and also to map all the mRNA molecules, in a biological sample using high-throughput sequencing technologies. A number of research groups have compared the amount of transcription factor binding in many species of flies and mammals (He et al., 2011; Paris et al., 2013; Schmidt et al., 2010; Ballester et al., 2014). Based on this work it seemed as if transcription factor binding evolved rapidly in mammalian tissues (Weirauch and Hughes, 2010), but only very slowly in fruit flies (He et al., 2011). However, it can be difficult to compare the first results generated in an entirely novel field of study because different groups often use very different approaches. In this case, the difficulty is further compounded by the toothbrush issue.
Now, in eLife, Trey Ideker and colleagues at the University of California San Diego – including Anne-Ruxandra Carvunis, Tina Wang and Dylan Skola as joint first authors – report that they used a new analysis pipeline to study the raw data for more than 25 species of complex eukaryotes across three animal lineages (mammals, birds and insects) that previously had only been studied in isolation (Carvunis et al., 2015). In other words, they have cleaned everyone’s teeth with the same toothbrush. Moreover, their pipeline could be tweaked to vary the analysis parameters for all the datasets across three lineages at once, thus allowing them to make like-with-like comparisons.
This intellectual scrubbing resulted in two major insights. First, it appears that transcription factor binding (which dictates the function of the genome) and mRNA both evolve at a shared (and perhaps even fundamental) rate in complex eukaryotes. This result is somewhat surprising since most evolutionary geneticists think that the mechanisms that influence genome or functional evolution for the lineages studied by Carvunis et al. are radically different.
Second, particularly in mammals, the evolution of the genome sequence en masse is much more rapid than the evolution of transcription factor binding and transcription. This disconnect may be linked to the instability of the large number of largely silent repeat elements in mammalian genomes, and/or to the fact that insects and birds have more stable genomes.
Moreover, Carvunis et al. have powerfully demonstrated why it is important for all of us in the functional genomics community to meticulously curate our raw data and to make it readily available for others to analyse. None of the insights reported in this work would have been possible without easy access to carefully annotated sequencing reads from the original studies.
References
- Vaquerizas et al. (2009) A census of human transcription factors: function, expression and evolution. Nature Reviews Genetics 10:252–263. https://doi.org/10.1038/nrg2538
- Wittkopp and Kalay (2012) Cis-regulatory elements: molecular mechanisms and evolutionary processes underlying divergence. Nature Reviews Genetics 13:59–69. https://doi.org/10.1038/nrg3095
- Wray (2007) The evolutionary significance of cis-regulatory mutations. Nature Reviews Genetics 8:206–216. https://doi.org/10.1038/nrg2063
Article and author information
Publication history
- Version of Record published: February 11, 2016 (version 1)
Copyright
© 2016, Odom
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.