Systematic identification of cis-regulatory variants that cause gene expression differences in a yeast cross

  1. Kaushik Renganaath
  2. Rockie Chong
  3. Laura Day
  4. Sriram Kosuri
  5. Leonid Kruglyak (corresponding author)
  6. Frank Wolfgang Albert (corresponding author)
  1. University of Minnesota, United States
  2. University of California, Los Angeles, United States

Abstract

Sequence variation in regulatory DNA alters gene expression and shapes genetically complex traits. However, the identification of individual, causal regulatory variants is challenging. Here, we used a massively parallel reporter assay to measure the cis-regulatory consequences of 5,832 natural DNA variants in the promoters of 2,503 genes in the yeast Saccharomyces cerevisiae. We identified 451 causal variants, which underlie genetic loci known to affect gene expression. Several promoters harbored multiple causal variants. In five promoters, pairs of variants showed non-additive, epistatic interactions. Causal variants were enriched at conserved nucleotides, tended to have low derived allele frequency, and were depleted from promoters of essential genes, which is consistent with the action of negative selection. Causal variants were also enriched for alterations in transcription factor binding sites. Models integrating these features provided modest, but statistically significant, ability to predict causal variants. This work revealed a complex molecular basis for cis-acting regulatory variation.

Data availability

Raw data and barcode assignments to oligos are available under GEO accession GSE155944. Source Data is provided for Figures 2, 3, 4, 5, and 6. Additional processed data and the MPRA design are available as Supplementary Files.


Article and author information

Author details

  1. Kaushik Renganaath

    Department of Genetics, Cell Biology, & Development, University of Minnesota, Minneapolis, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0003-1010-3604
  2. Rockie Chong

    Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, United States
  3. Laura Day

    Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
  4. Sriram Kosuri

    Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, United States
    ORCID: 0000-0002-4661-0600
  5. Leonid Kruglyak

    Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
    For correspondence
    LKruglyak@mednet.ucla.edu
    ORCID: 0000-0002-8065-3057
  6. Frank Wolfgang Albert

    Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, United States
    For correspondence
    falbert@umn.edu
    ORCID: 0000-0002-1380-8063

Funding

National Institutes of Health (R35GM124676)

  • Frank Wolfgang Albert

Howard Hughes Medical Institute

  • Leonid Kruglyak

Pew Charitable Trusts

  • Frank Wolfgang Albert

Alfred P. Sloan Foundation

  • Frank Wolfgang Albert

Kinship Foundation

  • Sriram Kosuri

U.S. Department of Energy (DE-FC02-02ER63421)

  • Sriram Kosuri

National Institutes of Health (R01GM102308)

  • Leonid Kruglyak

National Institutes of Health (DP2GM114829)

  • Sriram Kosuri

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

© 2020, Renganaath et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.



Cite this article

Kaushik Renganaath, Rockie Chong, Laura Day, Sriram Kosuri, Leonid Kruglyak, Frank Wolfgang Albert (2020) Systematic identification of cis-regulatory variants that cause gene expression differences in a yeast cross. eLife 9:e62669. https://doi.org/10.7554/eLife.62669

