
Natural variation in C. elegans arsenic toxicity is explained by differences in branched chain amino acid metabolism

  1. Stefan Zdraljevic
  2. Bennett William Fox
  3. Christine Strand
  4. Oishika Panda
  5. Francisco J Tenjo
  6. Shannon C Brady
  7. Tim A Crombie
  8. John G Doench
  9. Frank C Schroeder
  10. Erik C Andersen  Is a corresponding author
  1. Northwestern University, United States
  2. Cornell University, United States
  3. Broad Institute of MIT and Harvard, United States
  4. The Buck Institute for Research on Aging, United States
Cite this article as: eLife 2019;8:e40260 doi: 10.7554/eLife.40260

Abstract

We find that variation in the dbt-1 gene underlies natural differences in Caenorhabditis elegans responses to the toxin arsenic. This gene encodes the E2 subunit of the branched-chain α-keto acid dehydrogenase (BCKDH) complex, a core component of branched-chain amino acid (BCAA) metabolism. We causally linked a non-synonymous variant in the conserved lipoyl domain of DBT-1 to differential arsenic responses. Using targeted metabolomics and chemical supplementation, we demonstrate that differences in responses to arsenic are caused by variation in iso-branched chain fatty acids. Additionally, we show that levels of branched chain fatty acids in human cells are perturbed by arsenic treatment. This finding has broad implications for arsenic toxicity and for arsenic-focused chemotherapeutics across human populations. Our study implicates the BCKDH complex and BCAA metabolism in arsenic responses, demonstrating the power of C. elegans natural genetic diversity to identify novel mechanisms by which environmental toxins affect organismal physiology.

Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed (see decision letter).

https://doi.org/10.7554/eLife.40260.001

Introduction

An estimated 100 million people are currently at risk of chronic exposure to arsenic, a toxic metalloid that can be found in the environment (Ravenscroft et al., 2009). The high prevalence of environmental arsenic and the severe toxicity associated with exposure has made it the number one priority for the United States Agency for Toxic Substances and Disease Registry (https://www.atsdr.cdc.gov/SPL/). Inorganic trivalent arsenic As(III) compounds, which include arsenic trioxide (As2O3), are the most toxic forms of environmental arsenic (Ratnaike, 2003; Mandal and Suzuki, 2002). In humans, As(III) is detoxified by consecutive methylation events, forming dimethylarsenite (DMA) (Khairul et al., 2017; Stýblo et al., 2002). However, this methylation process also creates the highly toxic monomethylarsenite (MMA) intermediate, so ratios of DMA to MMA determine levels of arsenic toxicity. Both MMA and DMA are produced from As(III) via the arsenic methyltransferase (AS3MT) (Schlebusch et al., 2015). Interestingly, individuals from human subpopulations that inhabit high arsenic environments have higher DMA/MMA ratios than individuals from low-arsenic environments. The elevated DMA/MMA ratio in these individuals is associated with natural differences in the AS3MT gene (Chung et al., 2009; Fujihara et al., 2009; Gomez-Rubio et al., 2010), which shows signs of strong positive selection. These results suggest that a more active AS3MT enzyme in these human subpopulations makes more DMA and enables adaptation to elevated environmental arsenic levels (Schlebusch et al., 2015). Importantly, population-wide differences in responses to environmental arsenic cannot be explained solely by variation in AS3MT, indicating that other genes must impact arsenic toxicity.

Despite its toxicity, arsenic trioxide has been used as a therapeutic agent for hundreds of years. Most recently, it was introduced as a highly effective cancer chemotherapeutic for the treatment of acute promyelocytic leukemia (APL) (Chen et al., 1997; Antman, 2001; Murgo, 2001; Emi, 2017). Hematopoietic differentiation and apoptosis in APL patients are blocked at the level of promyelocytes by the Promyelocytic Leukemia/Retinoic Acid Receptor alpha fusion protein caused by a t(15;17) chromosomal translocation (de Thé et al., 1990; Grignani et al., 2000). Arsenic trioxide has been shown to directly bind a cysteine-rich region of the RING-B box coiled-coil domain of PML-RARα, which causes the degradation of the oncogenic fusion protein (Zhang et al., 2010; Tomita et al., 2013). The success of arsenic trioxide (Trisenox) has spurred its use in over a hundred clinical trials in the past decade (Hoonjan et al., 2018). Despite these successes, individual differences in the responses to arsenic-based treatments, including patient-specific dosing regimens and side effects, limit the full therapeutic benefit of this compound (Zeidan and Gore, 2014). Medical practitioners require knowledge of the molecular mechanisms for how arsenic causes toxicity to provide the best individual-based therapeutic benefits.

Studies of the free-living roundworm Caenorhabditis elegans have greatly facilitated our understanding of basic cellular processes (Kniazeva et al., 2004; Luz et al., 2017; Spracklin et al., 2017; Watson et al., 2013), including a number of studies that show that the effects of arsenic are similar to what is observed in mammalian model systems and humans. These effects include mitochondrial toxicity (Luz and Meyer, 2016; Luz et al., 2016), the generation of reactive oxygen species (ROS) (Schmeisser et al., 2013), genotoxicity (Wyatt et al., 2017), genome-wide shifts in chromatin structure (Large et al., 2016), reduced lifespan (Schmeisser et al., 2013), and the induction of the heat-shock response (Wang et al., 2017). However, these studies were all performed in the genetic background of the standard C. elegans laboratory strain (N2). To date, 152 C. elegans strains have been isolated from various locations around the world (Andersen et al., 2012; Cook et al., 2016; Cook et al., 2017), which contain a largely unexplored pool of genetic diversity, much of which could underlie adaptive responses to environmental perturbations (Zdraljevic and Andersen, 2017).

We used two quantitative genetic mapping approaches to show that a major source of variation in C. elegans responses to arsenic trioxide is caused by natural variation in the dbt-1 gene, which encodes an essential component of the highly conserved branched-chain α-keto acid dehydrogenase (BCKDH) complex (Jia et al., 2016). The BCKDH complex is a core component of branched-chain amino acid (BCAA) catabolism, which has not been previously implicated in arsenic responses. Furthermore, we show that a single missense variant in DBT-1(C78S), located in the highly conserved lipoyl-binding domain, underlies phenotypic variation in response to arsenic. Using targeted and untargeted metabolomics and chemical rescue experiments, we show that differences in wild isolate responses to arsenic trioxide are caused by differential synthesis of mono-methyl branched chain fatty acids (mmBCFA), metabolites with a central role in development (Kniazeva et al., 2004). These results demonstrate the power of using the natural genetic diversity across the C. elegans species to identify mechanisms by which environmental toxins affect physiology.

Results

Natural variation on chromosome II underlies differences in arsenic responses

We quantified arsenic trioxide sensitivity in C. elegans using a high-throughput fitness assay that utilizes the COPAS BIOSORT (Andersen et al., 2015; Zdraljevic and Andersen, 2017). In this assay, three L4 larvae from each strain were sorted into arsenic trioxide or control conditions. After four days of growth, we quantified various attributes of populations that relate to the ability of C. elegans to grow in the presence of arsenic trioxide or control conditions (see Materials and methods). To determine an appropriate concentration of arsenic trioxide for mapping experiments, we performed dose-response experiments on four genetically diverged isolates of C. elegans: N2, CB4856, JU775, and DL238 (Figure 1—figure supplement 1; Figure 1—figure supplement 1—source data 1). To assess arsenic-induced toxicity, we studied four independently measured traits: brood size, animal length, optical density, and fluorescence (see Materials and methods). When compared to control conditions, all four strains produced fewer progeny at all arsenic trioxide concentrations, and the lowest concentration at which we observed a significant reduction in brood size for all strains was 1 mM (Figure 1—figure supplement 1A). We used statistical summaries of measurements of individual animals as replicated identical genotypes to estimate broad-sense heritability (H2) across the four strains, but this analysis might not represent the effects of arsenic on individuals within natural populations. For the brood size trait in 1 mM arsenic trioxide, we calculated H2 to be 0.65 (Figure 1—figure supplement 3—source data 1) and the strain effect to be 0.48 (partial omega squared, ωp2, Figure 1—figure supplement 3—source data 2), indicating that this trait has a large genetic component and a large strain effect.
In addition to brood size effects, we observed that the progeny of animals exposed to arsenic trioxide were shorter in length than the progeny of animals grown in control conditions (Figure 1—figure supplement 1B), which indicates an arsenic-induced developmental delay (animal length (mean.TOF) H2 = 0.13; Figure 1—figure supplement 3—source data 1 and ωp2 = 0.09; Figure 1—figure supplement 3—source data 2). As C. elegans develop, the animals increase in optical density. Therefore, it is not surprising that we found an arsenic-induced decrease in optical density of progeny populations, which is further support for an arsenic-induced developmental delay (Figure 1—figure supplement 1C; H2 = 0.19; Figure 1—figure supplement 3—source data 1 and ωp2 = 0.07; Figure 1—figure supplement 3—source data 2). We also observed an arsenic-induced effect on yellow autofluorescence (Figure 1—figure supplement 1D; H2 = 0.50; Figure 1—figure supplement 3—source data 1 and ωp2 = 0.21; Figure 1—figure supplement 3—source data 2). Overall, the CB4856 strain was less affected by arsenic – that strain produced approximately 16% more offspring that were on average 20% larger than the other three strains when treated with 1 mM arsenic trioxide. These results suggest that the CB4856 strain was more resistant to arsenic trioxide than the other three strains. In addition to the BIOSORT-quantified traits, we generated a synthetic principal component (PC) trait using the four quantified traits described above (see Materials and methods, Figure 1—figure supplement 1—source data 1, Figure 1—figure supplement 2—source data 1 and Figure 1—source data 1). For 1 mM arsenic trioxide, we estimated the broad-sense heritability (H2) of the first principal component to be 0.37 (Figure 1—figure supplement 3—source data 1) with an effect size of 0.001 (ωp2) (Figure 1—figure supplement 3—source data 2).
The first principal component explained a large fraction (0.75) of the total phenotypic variance within the experiment, which is likely because the four input traits correlate with arsenic-induced toxicity (Figure 1—figure supplement 2A; Figure 1—figure supplement 1—source data 3). We noted that the first principal component (PC1) was most strongly influenced by the optical density trait, as indicated by the loadings (Figure 1—figure supplement 2B; Figure 1—figure supplement 3—source data 2 and Figure 1—source data 1), suggesting that PC1 is a biologically relevant trait (Figure 1—figure supplement 1E). Furthermore, because we observe a large range of effect sizes and broad-sense heritability estimates across measured traits (Figure 1—figure supplement 3), we focused our analyses on the PC1 trait derived from the four BIOSORT-quantified traits described above for all subsequent experiments (Materials and methods).
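As an illustration of the broad-sense heritability estimates reported above, the following minimal sketch computes H2 = Vg / (Vg + Ve) from replicate measurements of several strains. It assumes a balanced one-way design and a method-of-moments (ANOVA) estimator, which is not necessarily the estimator used in the study; the function name and the example values are hypothetical.

```python
from statistics import mean


def broad_sense_heritability(groups):
    """Estimate H2 = Vg / (Vg + Ve) from a balanced one-way design.

    `groups` holds one list of replicate trait values per strain.
    Assumption: ANOVA method-of-moments variance components, used here
    only for illustration.
    """
    k = len(groups)           # number of strains
    n = len(groups[0])        # replicates per strain
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need a balanced design with >= 2 strains and replicates")

    grand_mean = mean(v for g in groups for v in g)
    # Mean square between strains and mean square within strains.
    ms_between = n * sum((mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    ms_within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (k * (n - 1))

    v_g = max((ms_between - ms_within) / n, 0.0)  # genetic variance, floored at 0
    v_e = ms_within                                # residual variance
    if v_g + v_e == 0:
        raise ValueError("no variance in the data")
    return v_g / (v_g + v_e)


# Hypothetical brood-size replicates for three strains (not data from the study).
example = [[10, 12, 11], [20, 22, 21], [30, 32, 31]]
h2 = broad_sense_heritability(example)
```

With strongly separated strain means and little replicate noise, as in the made-up example, the estimate approaches 1; when strains are indistinguishable, it falls to 0.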

The increased arsenic trioxide resistance of CB4856 compared to N2 motivated us to perform linkage mapping experiments with a panel of recombinant inbred advanced intercross lines (RIAILs) that were previously constructed through ten generations of intercrossing between an N2 derivative (QX1430) and CB4856 (Andersen et al., 2015). To capture arsenic trioxide-induced phenotypic differences, we exposed a panel of 252 RIAILs to 1 mM arsenic trioxide and corrected for growth differences among RIAILs in control conditions and assay-to-assay variability using linear regression (Figure 1—source data 2; see Materials and methods). We performed linkage mapping on processed traits and the eigenvector-transformed traits (principal components or PCs) obtained from PCA that explained 90% of the variance in the processed trait set (Materials and methods). The rationale of this approach was to minimize trait fluctuations that could be caused by only measuring the phenotypes of one replicate per RIAIL strain, and PC1 captured overall arsenic-induced toxicity. In agreement with our observations from the dose-response experiment, we found that PC1 captures 69.5% of the total measured trait variance and is strongly influenced by animal length traits (Figure 1—figure supplement 4; Figure 1—figure supplement 11—source data 1 and 2). Linkage mapping analysis of the PC1 trait revealed that arsenic trioxide-induced phenotypic variation is significantly associated with genetic variation on the center of chromosome II (Figure 1—figure supplement 4; Figure 1—source data 3). An additional quantitative trait locus (QTL) was significantly associated with variation in arsenic responses on chromosome X (Figure 1—figure supplement 4). Consistent with the loadings of PC1, we determined that PC1 is highly correlated with both the brood size and animal length traits (Figure 1B), suggesting that PC1 captures RIAIL variation in these traits.
Further supporting the biological interpretability of PC1, we found that the four traits used as input for PCA all map to the same region on the center of chromosome II (Figure 1—figure supplement 5; Figure 1—source data 3). The QTL on the center of chromosome II explains 35.4% of the total RIAIL phenotypic variation for the PC1 trait, which accounts for 63.4% of the total phenotypic variation that can be explained by genetic factors (H2 = 0.56) (Figure 1—figure supplement 6; Figure 1—figure supplement 6—source data 1). Taken together, the two QTL identified by mapping the PC1 trait account for 40% of the total RIAIL variation, corresponding to 71.6% of the total phenotypic variation that can be explained by genetic factors. However, we did not account for errors in genomic heritability estimates. In addition to the two QTL that explain variation of the PC1 trait, we identified a QTL on chromosome I for the brood size and optical density traits, and a QTL on chromosome V that explained variation in animal length and optical density upon arsenic exposure (Figure 1—figure supplement 7; Figure 1—source data 3). The PC1 QTL confidence interval spans from 7.04 to 8.87 Mb on chromosome II. This QTL overlaps with the brood size (6.18-9.31 Mb) and animal length (6.92-8.70 Mb) QTL confidence intervals and is identical to the optical density and fluorescence QTL (Figure 1—figure supplement 5; Figure 1—source data 3). However, each of these QTL confidence intervals spans a genomic region greater than 1.5 megabases and contains hundreds of genes that vary between the N2 and CB4856 strains.
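The relationship between the variance explained by a QTL and the share of heritable variation it accounts for can be checked with simple arithmetic: dividing the former by H2 gives the latter. The short sketch below is only a consistency check on the percentages quoted above; the function names are illustrative, and small discrepancies are expected because the reported H2 of 0.56 is rounded.

```python
def heritable_fraction(variance_explained, h2):
    """Share of heritable variation captured by a QTL: VE / H2."""
    if not 0 < h2 <= 1:
        raise ValueError("h2 must be in (0, 1]")
    return variance_explained / h2


def implied_h2(variance_explained, fraction_of_heritable):
    """Back-calculate H2 from the two reported percentages."""
    if fraction_of_heritable <= 0:
        raise ValueError("fraction_of_heritable must be positive")
    return variance_explained / fraction_of_heritable


# Chromosome II QTL alone: 35.4% of total variation, 63.4% of heritable variation.
h2_from_chr2 = implied_h2(0.354, 0.634)

# Both PC1 QTL together: 40% of total variation, 71.6% of heritable variation.
h2_from_both = implied_h2(0.40, 0.716)

# Using the rounded H2 = 0.56 directly gives roughly 63% for the chromosome II QTL.
chr2_share = heritable_fraction(0.354, 0.56)
```

Both back-calculated values land close to 0.56, so the two pairs of percentages are mutually consistent with a single heritability estimate.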

Figure 1 with 11 supplements
A large-effect QTL on the center of chromosome II explains differences in arsenic trioxide response between N2 and CB4856.

(A) A linkage mapping plot for the first principal component trait in the presence of 1000 µM arsenic trioxide is shown. The significance values (logarithm of odds, LOD, ratio) for 1454 markers between the N2 and CB4856 strains are on the y-axis, and the genomic position (Mb) separated by chromosome is plotted on the x-axis. The associated 1.5 LOD-drop confidence intervals are represented by blue boxes. The phenotypic variance explained by each QTL is shown above the peak QTL marker, which is marked by red triangles. (B) The correlation of brood size (blue; r2 = 0.38, p-value=1.65E-27) or animal length (pink; r2 = 0.74, p-value=3.16E-74) with the first principal component trait is shown. Each dot represents an individual RIAIL’s phenotype, with the animal length and brood size phenotype values on the x-axis and the first principal component phenotype on the y-axis. (C) Tukey box plots of near-isogenic line (NIL) phenotype values for the first principal component trait in the presence of 1000 µM arsenic trioxide are shown. NIL genotypes are indicated below the plot as genomic ranges. The N2 trait is significantly different from the CB4856 and NIL traits (Tukey HSD p-value<1E-5).

https://doi.org/10.7554/eLife.40260.002

Next, we constructed near-isogenic lines (NILs) to isolate and narrow the chromosome II QTL in a controlled genetic background. We introgressed genomic regions from the CB4856 strain on the left and right halves of the confidence interval into the N2 genetic background. In the presence of arsenic trioxide, both of these NILs recapitulated the parental CB4856 PC1 phenotype (Figure 1C; Figure 1—source data 4) and had similar trait values for the four traits used as inputs into the PCA (Figure 1—figure supplement 8; Figure 1—source data 4). Furthermore, we showed that similar to the RIAIL phenotypes, the measured traits were correlated (Figure 1—figure supplement 9A; Figure 1—figure supplement 9—source data 1) and contributed similarly to the PC1 trait (Figure 1—figure supplement 9B; Figure 1—figure supplement 9—source data 2). In addition, the PC1 trait was highly correlated with the four input traits (Figure 1—figure supplement 10). The phenotypic similarity of these NILs to the CB4856 parental strain suggested that the two NILs might share an introgressed region of the CB4856 genome. To identify this shared introgressed region, we performed low-coverage whole-genome sequencing of the NIL strains and defined the left and right bounds of the CB4856 genomic introgression to be from 5.75 to 8.02 Mb and 7.83 to 9.66 Mb in the left and right NILs, respectively (Figure 1—source data 5). The left and right NILs recapitulate 70.6% and 81.9% of the effect size difference between N2 and CB4856, respectively, as measured by Cohen’s F (Cohen, 2013), which exceeds our observations from the linkage mapping results, where the QTL on chromosome II explained 63.4% of the total phenotypic variation in the RIAIL population. This discrepancy likely arose because the NILs have a more homogeneous genetic background and the experiment was performed at higher replication than the linkage mapping.
We observed similar levels of phenotypic recapitulation for the four traits used as inputs for the PCA (brood size: 57.8% and 87.4%; animal length: 98.5% and 100%; optical density: 69.5% and 68.1%; fluorescence: 58.5% and 64.2% for the left and right NILs). Taken together, these results suggested that genetic differences between N2 and CB4856 within 7.83 to 8.02 Mb on chromosome II conferred resistance to arsenic trioxide.
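The narrowed interval follows from intersecting the two NIL introgressions: because both NILs show the resistant phenotype, the region they share is the candidate. A minimal sketch of that interval logic, using the bounds reported above (the function name is illustrative):

```python
def overlap(a, b):
    """Return the shared part of two (start, end) intervals in Mb, or None."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start < end else None


# CB4856 introgression bounds on chromosome II (Mb), as reported in the text.
left_nil = (5.75, 8.02)
right_nil = (7.83, 9.66)

shared = overlap(left_nil, right_nil)
print(shared)  # (7.83, 8.02)
```

The shared region also lies inside the PC1 QTL confidence interval from linkage mapping (7.04 to 8.87 Mb), which is what one would expect if the same locus drives both results.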

In parallel to the linkage-mapping approach described above, we performed a genome-wide association (GWA) mapping experiment by quantifying the responses to arsenic trioxide for 86 wild C. elegans strains (Figure 2—figure supplement 1—source data 1) (Andersen et al., 2012). Consistent with previous experiments, the PC1 trait was influenced less by the brood size trait, as indicated by the loadings (Figure 2—figure supplement 1; Figure 2—source data 1 and Figure 5—figure supplement 1—source data 1). In agreement with the results from the linkage mapping approach, PC1 differences among the wild isolates mapped to a QTL on the center of chromosome II that spans from 7.6 Mb to 8.21 Mb (Figure 2A; Figure 2—figure supplement 3—source data 1 and Figure 2—figure supplement 5—source data 1). However, we noted that the brood size trait did not map to a significant QTL with the GWA mapping approach, which is most likely due to the lower statistical power of this approach. Interestingly, the genomic estimates of broad- and narrow-sense heritability (H2 and h2) were low for all of the measured and principal component traits in the wild isolates (Figure 2—figure supplement 2; Figure 2—source data 2), which could be because the center of chromosome II has not experienced the chromosome-scale selective sweeps (Andersen et al., 2012) that contribute to much of the population structure within the species. The marker found to be most correlated with the PC1 trait from GWA mapping (II:7,931,252) explains 84.6% of the total heritable phenotypic variation. In addition to the PC1 trait, three of the four measured traits also mapped to significant QTL on the center of chromosome II (Figure 2—figure supplement 1; Figure 2—figure supplement 1—source data 1).
Notably, the CB4856 strain, which was one of the parents used to construct the RIAIL panel used for linkage mapping, had the non-reference genotype at the marker most correlated with PC1 (Figure 2B), suggesting that the same genetic variant(s) might be contributing to the differential arsenic trioxide response between the RIAIL and wild isolate populations.

Figure 2 with 6 supplements
Variation in C. elegans wild isolates responses to arsenic trioxide maps to the center of chromosome II.

(A) A Manhattan plot for the first principal component in the presence of 1000 µM arsenic trioxide is shown. Each dot represents an SNV that is present in at least 5% of the assayed wild population. The genomic position in Mb, separated by chromosome, is plotted on the x-axis and the -log10(p) for each SNV is plotted on the y-axis. SNVs are colored red if they pass the genome-wide Bonferroni-corrected significance (BF) threshold, which is denoted by the gray horizontal line. SNVs are colored pink if they pass the genome-wide eigen-decomposition significance (ED) threshold, which is denoted by the dotted gray horizontal line. The genomic regions of interest surrounding the QTL that pass the BF and ED thresholds are represented by cyan and pink rectangles, respectively. (B) Tukey box plots of phenotypes used for association mapping in (A) are shown. Each dot corresponds to the phenotype of an individual strain, which is plotted on the y-axis. Strains are grouped by their genotype at the peak QTL position (red SNV from panel A, ChrII:7,931,252), where REF corresponds to the allele from the reference N2 strain. The N2 (orange) and CB4856 (blue) strains are highlighted. (C) Fine mapping of the chromosome II region of interest (cyan region from panel A, 7.60–8.21 Mb) is shown. Each dot represents an SNV present in the CB4856 strain. The association between the SNV and first principal component is shown on the y-axis and the genomic position of the SNV is shown on the x-axis. Dots are colored by their SnpEff predicted effect.

https://doi.org/10.7554/eLife.40260.030

To fine map the PC1 QTL, we focused on variants from the C. elegans whole-genome variation dataset (Cook et al., 2016) that are shared among at least 5% of the 86 wild isolates exposed to arsenic trioxide. Under the assumption that the linkage and GWA mapping QTL are caused by the same genetic variation, we only considered variants present in the CB4856 strain. Eight markers within the QTL region are in complete linkage disequilibrium with each other and are most correlated with the PC1 trait (Figure 2—figure supplement 2; Figure 2—source data 1). Only one of these markers is located within an annotated gene (dbt-1) and is predicted to encode a cysteine-to-serine variant at position 78 (C78S). Although it is possible that the causal variant underlying differential arsenic trioxide response in the C. elegans population is an intergenic variant, we focused on the DBT-1(C78S) variant as a candidate to test for an effect on arsenic response.
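The two filters described above (a variant must be shared by at least 5% of the assayed isolates and must be present in CB4856) can be sketched as follows. The record layout, variant identifiers, and strain names other than CB4856 are hypothetical and stand in for whatever the variation dataset actually provides.

```python
import math


def min_carriers(n_strains, min_fraction=0.05):
    """Smallest number of strains that satisfies the frequency cutoff."""
    return math.ceil(n_strains * min_fraction)


def candidate_variants(variants, n_strains, required_strain="CB4856",
                       min_fraction=0.05):
    """Keep variants that are common enough and present in the required strain.

    `variants` is a list of dicts with an "id" and a set of "carriers"
    (hypothetical layout, for illustration only).
    """
    threshold = min_carriers(n_strains, min_fraction)
    return [
        v["id"]
        for v in variants
        if len(v["carriers"]) >= threshold and required_strain in v["carriers"]
    ]


# Made-up records: only var_A is both common enough and present in CB4856.
variants = [
    {"id": "var_A", "carriers": {"CB4856", "S1", "S2", "S3", "S4"}},
    {"id": "var_B", "carriers": {"CB4856", "S1"}},
    {"id": "var_C", "carriers": {"S1", "S2", "S3", "S4", "S5", "S6"}},
]

kept = candidate_variants(variants, n_strains=86)
```

For 86 isolates, a 5% cutoff corresponds to at least five strains carrying the variant.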

A cysteine-to-serine variant in DBT-1 contributes to arsenic response variation

The C. elegans dbt-1 gene encodes the E2 component of the branched-chain α-keto acid dehydrogenase complex (BCKDH) (Jia et al., 2016). The BCKDH complex is a core component of branched-chain amino acid (BCAA) catabolism and catalyzes the irreversible oxidative decarboxylation of amino-acid-derived branched-chain α-ketoacids (Adeva-Andany et al., 2017). The BCKDH complex belongs to a family of α-ketoacid dehydrogenases that include pyruvate dehydrogenase (PDH) and α-ketoglutarate dehydrogenase (KGDH) (Bergquist et al., 2009). All three of these large enzymatic complexes include a central E2 component that is lipoylated at one critical lysine residue (two residues in PDH). The function of these enzymatic complexes depends on the lipoylation of these lysine residues (Bergquist et al., 2009; Reed and Hackert, 1990). In C. elegans, the putative lipoylated lysine residue is located at amino acid position 71 of DBT-1, which is in close proximity to the C78S residue that we found to be highly correlated with arsenic trioxide resistance.

To confirm that the C78S variant in DBT-1 contributes to differential arsenic trioxide responses, we used CRISPR-Cas9-mediated genome editing to generate allele-replacement strains by changing the C78 residue in the N2 strain to a serine and the S78 residue in the CB4856 strain to a cysteine. When treated with arsenic trioxide, the N2 DBT-1(S78) allele-replacement strain recapitulated 56.4% of the phenotypic difference between the CB4856 and N2 strains as measured with the first principal component (Cohen’s F) (Cohen, 2013) (Figure 3; Figure 1—source data 4). Similarly, the CB4856 DBT-1(C78) allele-replacement strain recapitulated 64.8% of the total phenotypic difference between the two parental strains. The degree to which the allele-replacement strains recapitulated the difference in the PC1 trait between the N2 and CB4856 strains matched our observations from the linkage mapping experiment, where the chromosome II QTL explained 63.4% of the total phenotypic variation in the RIAIL population. This result suggested that the majority of heritable variation in arsenic trioxide response was explained by the DBT-1(C78S) allele. We obtained similar results for the BIOSORT-quantified traits (Figure 3—figure supplement 1; Figure 1—source data 4), suggesting that overall animal physiology is affected by arsenic exposure (Figure 1—figure supplement 9; Figure 3—figure supplement 2; Figure 1—figure supplement 11—source data 2 and Figure 1—source data 3). However, when considering brood size, the N2 DBT-1(C78S) allele-replacement strain produced an intermediate number of progeny in the presence of arsenic trioxide relative to the parental N2 and CB4856 strains, and the CB4856 DBT-1(S78C) allele-replacement strain produced fewer offspring than both parental strains (Figure 3—figure supplement 1; Figure 1—source data 4).
These results suggested that additional genetic variants between the N2 and CB4856 strains might interact with the DBT-1(C78S) allele to affect different aspects of physiology. Nevertheless, these results functionally validated that the DBT-1 C78S variant underlies differences in physiological responses to arsenic trioxide.

Figure 3 with 2 supplements
The DBT-1(C78S) variant contributes to arsenic trioxide responses.

Tukey box plots of the first principal component generated by PCA on allele-replacement strain phenotypes measured by the COPAS BIOSORT after 1000 μM arsenic trioxide exposure are shown (N2, orange; CB4856, blue; allele replacement strains, gray). Labels correspond to the genetic background and the corresponding residue at position 78 of DBT-1 (C for cysteine, S for serine). All pair-wise comparisons are significantly different (Tukey HSD, p-value < 1E-7).

https://doi.org/10.7554/eLife.40260.046

Arsenic trioxide inhibits the DBT-1 C78 allele

Mono-methyl branched chain fatty acids (mmBCFA) are an important class of molecules that are produced via BCAA catabolism (Kniazeva et al., 2004; Jia et al., 2016; Kniazeva et al., 2008; Baugh, 2013). The production of mmBCFA requires the BCKDH, fatty acid synthase (FASN-1), acetyl-CoA carboxylase (POD-2), fatty acyl elongases (ELO-5/6), β-ketoacyl dehydratase (LET-767), and acyl CoA synthetase (ACS-1) (Kniazeva et al., 2004; Jia et al., 2016; Kniazeva et al., 2008; Watts and Ristow, 2017; Entchev et al., 2008; Zhu et al., 2013). Strains that lack functional elo-5, elo-6, or dbt-1 produce less C15ISO and C17ISO mmBCFAs, arrest at the L1 larval stage, and can be rescued by supplementing the growth media with C15ISO or C17ISO (Kniazeva et al., 2004; Jia et al., 2016; Kniazeva et al., 2008) (Figure 4A).

Figure 4 with 9 supplements
Differential production of mmBCFA underlies DBT-1(C78)-mediated sensitivity to arsenic trioxide.

(A) A simplified model of BCAA catabolism in C. elegans. The BCKDH complex, which includes DBT-1, catalyzes the irreversible oxidative decarboxylation of branched-chain ketoacids. The products of this breakdown can then serve as building blocks for the mmBCFA that are required for developmental progression. (B) The difference in the C15ISO/C15SC (left panel) or C17ISO/C17SC (right panel) ratios between 100 μM arsenic trioxide and control conditions is plotted on the y-axis for three independent replicates of the CB4856 and CB4856 allele replacement strains and six independent replicates of the N2 and N2 allele replacement strains. The difference between the C15 ratios for the CB4856-CB4856 allele replacement comparison is significant (Tukey HSD p-value = 0.0427733), but the difference between the C17 ratios for these two strains is not (Tukey HSD p-value = 0.164721). The differences between the C15 and C17 ratios for the N2-N2 allele replacement comparisons are both significant (C15: Tukey HSD p-value = 0.0358; C17: Tukey HSD p-value = 0.003747). (C) Tukey box plots of median animal length after arsenic trioxide or arsenic trioxide and 0.64 μM C15ISO exposure are shown (N2, orange; CB4856, blue; allele replacement strains, gray). Labels correspond to the genetic background and the corresponding residue at position 78 of DBT-1 (C for cysteine, S for serine). Every pair-wise strain comparison is significant except for the N2 DBT-1(S78) - CB4856 comparisons (Tukey’s HSD p-value < 1.43E-6).

https://doi.org/10.7554/eLife.40260.049

Because DBT-1 is involved in BCAA catabolism, we hypothesized that the DBT-1(C78S)-dependent difference in progeny length between the N2 and CB4856 strains after arsenic trioxide treatment might be caused by differential larval arrest through depletion of downstream mmBCFAs. To test this hypothesis, we quantified the abundance of the monomethyl-branched (ISO) and straight-chain (SC) forms of C15 and C17 in the N2, CB4856, and allele-replacement genetic backgrounds. We measured the metabolite levels in staged L1 animals and normalized the detected amounts of C15ISO and C17ISO relative to the abundances of C15SC and C17SC, respectively, to mitigate the confounding effects of differences in developmental rates that could be caused by genetic background differences after arsenic trioxide exposure. Generally, the ratios of C15ISO/C15SC and C17ISO/C17SC were reduced in arsenic-treated animals relative to controls (Figure 4B; Figure 4—source data 1–3).
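The normalization described above amounts to dividing each branched-chain abundance by its straight-chain counterpart and then comparing arsenic and control conditions. A minimal sketch, with hypothetical abundance values in arbitrary units and illustrative function names:

```python
def iso_to_sc_ratio(iso, sc):
    """Normalize a branched-chain (ISO) abundance to its straight-chain (SC) form."""
    if sc <= 0:
        raise ValueError("straight-chain abundance must be positive")
    return iso / sc


def arsenic_effect(control, arsenic):
    """Change in the ISO/SC ratio (arsenic minus control).

    Each argument is an (iso, sc) pair of abundances from one condition.
    A negative value means the ratio dropped under arsenic treatment.
    """
    return iso_to_sc_ratio(*arsenic) - iso_to_sc_ratio(*control)


# Hypothetical abundances (arbitrary units), not measurements from the study.
control_sample = (40.0, 1000.0)   # (C15ISO, C15SC) in control conditions
arsenic_sample = (4.0, 1000.0)    # (C15ISO, C15SC) in arsenic trioxide

delta = arsenic_effect(control_sample, arsenic_sample)
```

Dividing by the straight-chain form means that a strain that simply develops more slowly, and therefore has less of both fatty acid types, does not register as a branched-chain-specific deficit.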

However, arsenic trioxide treatment had a 7.6-fold stronger effect on the C15ISO/C15SC ratio in N2, which naturally has the C78 allele, than on the N2 DBT-1(S78) allele replacement strain. This difference suggests that the DBT-1(C78) allele is more strongly inhibited by arsenic trioxide (0.04 to 0.004, Tukey HSD p-value=0.0358, n = 6). Similarly, we observed a 6.6-fold arsenic-induced reduction in the C17ISO/C17SC ratio when comparing the N2 DBT-1(C78) and N2 DBT-1(S78) strains (Tukey HSD p-value=0.003747, n = 6). When comparing the CB4856 DBT-1(S78) and CB4856 DBT-1(C78) strains, we observed a 2.8-fold lower C15ISO/C15SC ratio (Tukey HSD p-value=0.0427733, n = 3) and 1.5-fold lower C17ISO/C17SC ratio (Tukey HSD p-value=0.164721, n = 3) in the CB4856 DBT-1(C78) strain. We noted that the C17ISO/straight-chain ratio difference was not significantly different between the two CB4856 genetic background strains. However, we observed a significant arsenic-induced decrease in raw C17ISO production in the CB4856 DBT-1(C78) strain (Tukey HSD p-value=0.029) and no significant difference in the CB4856 DBT-1(S78) strain (Tukey HSD p-value=0.1) (Figure 4—figure supplement 1). Importantly, these DBT-1(C78S) allele-specific reductions in ISO/straight-chain ratios were not caused by arsenic-induced differences in straight-chain fatty acids (Figure 4—figure supplement 2). These results explained the majority of the physiological differences between the N2 and CB4856 strains in the presence of arsenic trioxide (Figure 3) and suggested that the DBT-1(C78) allele was inhibited by arsenic trioxide more strongly than DBT-1(S78). Taken together, the differential reduction in branched-chain fatty acids likely underlies the majority of physiological differences between the sensitive and resistant C. elegans strains.

In addition to arsenic-induced differences in branched-chain fatty acid production, we observed significant differences in branched/straight-chain ratios between the parental and allele replacement strains when L1 larval animals were grown in control conditions (Figure 4—figure supplement 3; Figure 4—source data 1–3). Strains with the DBT-1(C78) allele had higher ISO/SC ratios relative to strains with the DBT-1(S78) allele for the C17 (CB4856 DBT-1(C78): Tukey HSD p-value=0.0342525, n = 3; N2 DBT-1(C78): Tukey HSD p-value=0.0342525, n = 6) and C15 ratios (CB4856: Tukey HSD p-value=0.0168749, n = 3; N2: Tukey HSD p-value=0.1239674, n = 6) (Figure 4—figure supplement 3; Figure 4—source data 1–3). We noted that the C15ISO/straight-chain ratio was not significantly different when comparing the N2 and N2 allele replacement strains, but the direction of effect matched our other observations, and we saw significant differences in C15ISO levels (N2-C15ISO DBT-1(C78): Tukey HSD p-value=0.0265059, n = 6, Figure 4—figure supplement 4; Figure 4—source data 1–3). Importantly, the DBT-1 allele-specific differences in the fatty acid ratio and ISO measurements were not caused by differences in straight-chain fatty acids (Figure 4—figure supplement 4). However, we did not observe the same effect of the DBT-1(C78S) allele at the young adult life stage (Figure 4—figure supplement 5; Figure 4—figure supplement 4—source data 1). Taken together, these results suggest that the DBT-1(C78) allele produces more branched-chain fatty acids than the DBT-1(S78) allele, but this effect was dependent on the developmental stage of the animals.

To test the hypothesis that differential arsenic-induced depletion of branched-chain fatty acids in strains with the DBT-1(C78S) variant causes physiological differences in growth, we tested if mmBCFA supplementation could rescue the effects of arsenic trioxide-induced toxicity. We exposed the parental and the DBT-1 allele-replacement strains to media containing arsenic trioxide alone, C15ISO alone, or a combination of arsenic trioxide and C15ISO. In agreement with previous experiments, the PC1 trait was more strongly correlated with the animal length, optical density, and fluorescence traits than the brood size trait (Figure 4—figure supplements 6–7; Figure 4—source data 4, Figure 4—figure supplement 5—source data 1 and Figure 4—figure supplement 2—source data 1). C15ISO supplementation of the arsenic growth media caused a 53.5% rescue of the allele-specific effect in the N2 genetic background (Figure 4—figure supplement 6). Similarly, when arsenic-exposed CB4856 DBT-1(C78) animals were supplemented with C15ISO, the allele-specific PC1 phenotypic difference was reduced by 25.6% when compared to the difference between the CB4856 DBT-1(C78) and CB4856 DBT-1(S78) strains in arsenic trioxide alone (Figure 4—figure supplement 6C). By contrast, CB4856 DBT-1(S78) and N2 DBT-1(S78) phenotypes were unaffected by C15ISO supplementation in arsenic trioxide media. We observed similar trends for the animal length, optical density, and fluorescence traits that we used as inputs for PCA but not for brood size (Figure 4—figure supplement 9C). Collectively, these data support the hypothesis that the cysteine/serine variant in DBT-1 contributes to arsenic sensitivity in C. elegans by reducing ISO fatty acid biosynthesis, and the DBT-1(C78) variant can be partially rescued by supplementation with mmBCFAs.
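The percent rescue reported above can be read as the fractional reduction of the allele-specific phenotypic difference once C15ISO is added. The following Python sketch illustrates that arithmetic. The function name and the example values are hypothetical, and the authors' own analysis was performed in R, so this is an illustration of the idea rather than their implementation.

```python
def percent_rescue(diff_arsenic_alone, diff_with_supplement):
    """Percent reduction of an allele-specific phenotypic difference.

    diff_arsenic_alone:   C78-vs-S78 trait difference in arsenic trioxide alone
    diff_with_supplement: the same difference in arsenic trioxide plus C15ISO
    """
    if diff_arsenic_alone == 0:
        raise ValueError("no allele-specific difference to rescue")
    return 100.0 * (1.0 - diff_with_supplement / diff_arsenic_alone)


# Hypothetical PC1 differences, not values from the study.
example = percent_rescue(2.0, 1.5)
print(f"{example:.1f}% rescue")
```

A value of 0% would mean supplementation left the allele-specific difference unchanged, and 100% would mean the difference disappeared.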

Arsenic exposure increases mmBCFA production and favors a cysteine allele in human DBT1

To test whether our results from C. elegans translate to human variation in arsenic sensitivity, we tested the role of DBT1 variation on arsenic trioxide responses and mmBCFA biosynthesis in human cells. The human DBT1 enzyme contains a serine at position 112 that corresponds to the C78 residue in C. elegans DBT-1 (Figure 4—figure supplement 6). Using CRISPR-Cas9, we edited batch cultures of 293T cells to generate a subset of cells with DBT1(S112C). These cells were exposed to arsenic trioxide or control conditions, and we monitored changes in the fraction of cells carrying the DBT1(C112) allele. We found that arsenic exposure caused a 4% increase in the fraction of cells that contained the DBT1(C112) allele (Figure 5B, Fisher’s exact test, p-value<1.9E-16; Figure 5—source data 1–2). Though the human DBT1 does not vary within the human population at S112, two residues in close spatial proximity to S112 do vary among individuals in the population (R113C and W84C) (Figure 5A) (Forbes et al., 2008). To test the effects of these residues on arsenic trioxide sensitivity, we performed the same editing and arsenic selection procedure described above. Over the course of the selection experiment, cells with DBT1(W84C) and DBT1(R113C) increased by 2% and 1%, respectively (Figure 5B). Therefore, it appears that all three missense variants caused a slight increase in fitness in batch-edited human cell cultures exposed to arsenic, the opposite of the result we found in C. elegans. To determine if branched-chain fatty acid production was altered by arsenic exposure, as we found in C. elegans, we measured mmBCFA production in unedited 293T cells in arsenic and mock-treated cultures. We found that overall fatty acid production was markedly reduced in arsenic-treated cultures. In contrast to our observations in C. elegans, straight-chain fatty acids were more drastically reduced than ISO fatty acids (Figure 5—source data 3), suggesting pleiotropic effects and a general perturbation of fatty acid metabolism.
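The allele-frequency comparison above relies on Fisher's exact test applied to read counts. A minimal Python sketch of a two-sided test on a 2x2 table follows. The table layout (rows for arsenic and control, columns for edited and unedited reads) and the counts are assumptions for illustration, and the exact-arithmetic approach shown here is intended for small tables, not for sequencing-depth counts.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical small table: rows = arsenic / control, columns = edited / unedited.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```

For read counts in the thousands or more, a library routine that works with log-probabilities would be a more practical choice than this direct summation.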

Figure 5
Protective effect of cysteine residues in human DBT1.

(A) Alignment of C. elegans DBT-1 and H. sapiens DBT1. The residues tested for an arsenic-specific effect are indicated with arrows: W84C (pink), S112C (blue), and R113C (black). The lysine that is post-translationally modified with a lipoic acid is highlighted in red. (B) The percent increase of edited human cells that contain the W84C, S112C, or R113C amino acid change in DBT1 in the presence of 5 µM arsenic trioxide relative to control conditions is shown. The number of reads in 5 µM arsenic trioxide for all replicates is significantly different from control conditions (Fisher’s exact test, p-value<0.011).

https://doi.org/10.7554/eLife.40260.066

Discussion

In this study, we characterized the effects of C. elegans natural genetic variation on physiological responses to the pervasive environmental toxin arsenic trioxide. Though the effects of this toxin have been extensively studied in a variety of systems (Ratnaike, 2003; Mandal and Suzuki, 2002; Bergquist et al., 2009; Paul et al., 2014; Shen et al., 2013), recent evidence from human population studies has revealed local adaptations within region-specific subpopulations (Schlebusch et al., 2015; Fujihara et al., 2009; Gomez-Rubio et al., 2010; Li et al., 2017). Our investigation into the natural variation in C. elegans responses to arsenic trioxide led to the discovery of a novel mechanism by which this compound could elicit toxicity. We show that arsenic trioxide differentially inhibits two natural alleles of the E2 domain of the BCKDH complex, which is encoded by the dbt-1 gene. Specifically, strains with the DBT-1(C78) allele are more sensitive to arsenic trioxide than strains carrying the DBT-1(S78) allele. Furthermore, we show that the increased sensitivity of the DBT-1(C78) allele is largely explained by differences in the production of mmBCFAs (Figure 4B–C), which are critical molecules for developmental progression beyond the first larval stage. Arsenic is thought to inhibit the activity of both the pyruvate dehydrogenase (PDH) and the α-ketoglutarate dehydrogenase (KGDH) complexes through interactions with the reduced form of lipoate (Bergquist et al., 2009), which is the cofactor for the E2 domain of these complexes. Like the PDH and KGDH complexes, the E2 domain of the BCKDH complex requires the cofactor lipoate to perform its enzymatic function (Pettit et al., 1978; Heffelfinger et al., 1983; Yeaman, 1989). The inhibition of DBT-1 by arsenic trioxide could involve three-point coordination of arsenic by the C78 thiol and the reduced thiol groups of the nearby lipoate. 
However, based on the crystal structure (PDB:1Y8N), the atomic distance between the C78 thiol group and the thiol groups from the lipoylated lysine is ~32 Å, which might be too far for coordinating arsenic (Figure 5—figure supplement 1) (Kato et al., 2005). Alternatively, arsenic trioxide could inhibit DBT-1(C78) through coordination between the thiol groups of C78 and C65 (~8.5 Å) (Figure 5—figure supplement 1). Analogous thiol-dependent mechanisms have been proposed for the inhibition of other enzymes by arsenic (Shen et al., 2013). Despite structural similarities and a shared cofactor, no evidence in the literature indicates that BCKDH is inhibited by arsenic trioxide, so these results demonstrate the first connection of arsenic toxicity to BCKDH E2 subunit inhibition.

Multiple sequence alignments show that cysteine residues C65 and C78 of C. elegans DBT-1 correspond to residues C99 and S112 of human DBT1, respectively (Figure 5A). Although DBT1 does not vary at position 112 within the human population, two residues (R113C and W84C) in close spatial proximity do (Forbes et al., 2008). We hypothesized that cysteine variants in DBT1 would sensitize human cells to arsenic trioxide. However, we found that the cysteine variants (W84C, S112C, and R113C) proliferated slightly more rapidly than the parental cells in the presence of arsenic. Notably, a growing body of evidence suggests that certain cancer cells upregulate components involved in BCAA metabolism, and this upregulation promotes tumor growth (Burrage et al., 2014; Tönjes et al., 2013). Perhaps the increased proliferation of human cell lines that contain the DBT1 C112 allele (Figure 5) is caused by increased activity of the BCKDH complex. It is worth noting that human cell lines grown in culture do not have the same strict requirements for mmBCFA, and the requirements for different fatty acids are variable among diverse immortalized cell lines (Hughes-Fulford et al., 2001; Agostini et al., 2004). Furthermore, in C. elegans, the developmental defects associated with dbt-1 loss-of-function can be rescued by neuronal-specific expression of dbt-1 (Jia et al., 2016), suggesting that the physiological requirements of mmBCFA in C. elegans depend on the coordination of multiple tissues that cannot be recapitulated with cell-culture experiments. These results further highlight the complexity of arsenic toxicity, as well as the physiological requirements for BCAA within and across species, and could explain the discrepancy between the physiological effects we observed in C. elegans and human cell-line experiments. 
Given that arsenic trioxide has become the standard-of-care for treating APL (Coombs et al., 2015) and is gaining traction in treating other leukemias, it might be important to further explore the effects of arsenic on BCAA metabolism and cancer growth.

The C78 allele of DBT-1 is likely the derived allele in the C. elegans population because all other organisms have a serine at the corresponding position. The loss of the serine allele in the C. elegans population might have been caused by relaxed selection at this locus, although this hypothesis is difficult to test because of the effects of linked selection and decreased recombination in chromosome centers. It is hypothesized that the C. elegans species originated from the Pacific Rim and that the ancestral state more closely resembles the CB4856 strain than the N2 strain (Andersen et al., 2012; Thompson et al., 2015). The CB4856 strain was isolated from the Hawaiian island of O’ahu (Hodgkin and Doniach, 1997), where environments could have elevated levels of arsenic in the soil from volcanic activity, the farming of cane sugar, former construction material (canec) production facilities, or wood treatment plants (Hawaii.gov). It is possible that as the C. elegans species spread across the globe into areas with lower levels of arsenic in the soil and water, the selective pressure to maintain high arsenic tolerance was relaxed and the cysteine allele appeared. Alternatively, higher mmBCFA levels at the L1 larval stage in strains with the DBT-1(C78) allele (Figure 4—figure supplements 3–4) might cause faster development in certain conditions, although we did not observe allele-specific growth differences in laboratory conditions. Despite these clues suggesting selection in local environments, the genomic region surrounding the dbt-1 locus does not show a signature of selection as measured by Tajima’s D (Tajima, 1989) (Figure 2—figure supplement 3; Figure 2—source data 3), and the strains with the DBT-1 S78 allele show no signs of geographic clustering (Figure 2—figure supplement 4; Figure 2—figure supplement 3—source data 1). Nevertheless, our study suggests that C. elegans is a powerful model to investigate the molecular mechanisms for how populations respond to environmental toxins.

Materials and methods

Key resources table
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
| --- | --- | --- | --- | --- |
| Gene (Caenorhabditis elegans) | dbt-1 | NA | Wormbase:WBGene00014054 | |
| Strain, strain background (C. elegans) | N2 DBT-1(S78) | This paper | Andersen_Lab:ECA581 | dbt-1(ean15[C78S]) |
| Strain, strain background (C. elegans) | CB4856 DBT-1(C78) | This paper | Andersen_Lab:ECA590 | dbt-1(ean34[S78C]) |
| Strain, strain background (C. elegans) | Left NIL; CB4856 > N2 (II:5.75–8.02 Mb) | This paper | Andersen_Lab:ECA414 | eanIR188[II:5.75–8.02 Mb, CB4856 > N2] |
| Strain, strain background (C. elegans) | Right NIL; CB4856 > N2 (II:7.83–9.66 Mb) | This paper | Andersen_Lab:ECA434 | eanIR208[II:7.83–9.66 Mb, CB4856 > N2] |
| Sequence-based reagent | NIL Fd primer | This paper | Andersen_Lab:oECA609 | tttcacacaaaccatgcgct |
| Sequence-based reagent | NIL Rv primer | This paper | Andersen_Lab:oECA610 | actcgtctgctgggtattct |
| Sequence-based reagent | NIL Fd primer | This paper | Andersen_Lab:oECA611 | tgtcttcgcacctttactcg |
| Sequence-based reagent | NIL Rv primer | This paper | Andersen_Lab:oECA612 | cattcaagtcagggcgtatcc |
| Sequence-based reagent | Genotype C78S Edit | This paper | Andersen_Lab:oECA1163 | GAAGGAATTGCCGAAGTTCAGGTTAAG |
| Sequence-based reagent | Genotype C78S Edit | This paper | Andersen_Lab:oECA1165 | CCGTCATCTCCACAAAAAGCTTTATCTCTC |
| Sequence-based reagent | dbt-1 gRNA | This paper | Andersen_Lab:crECA97 | CCATCTCCTGTAGATACGAC |
| Sequence-based reagent | N2 dbt-1 repair oligo | This paper | Andersen_Lab:oECA1542 | CTTCCAGGTACGTGAAAGAAGGAGATACGATTTCGCAGTTCGATAAAGTCTGTGAAGTGCAAAGTGATAAAGCAGCAGTAACCATCTCCAGTAGATACGACGGAATTGTCAAAAAATTGTAAGTTTCTTCCTAA |
| Sequence-based reagent | CB4856 dbt-1 repair oligo | This paper | Andersen_Lab:oECA1543 | TTAGGAAGAAACTTACAATTTTTTGACAATTCCGTCGTATCTACAGGAGATGGTTACTGCTGCTTTATCGCTTTGCACTTCACAGACTTTATCGAACTGCGAAATCGTATCTCCTTCTTTCACGTACCTGGAAG |
| Sequence-based reagent | dpy-10 repair oligo | Kim et al., 2014 | Andersen_Lab:crECA37 | CACTTGAACTTCAATACGGCAAGATGAGAATGACTGGAAACCGTACCGCATGCGGTGCCTATGGTAGCGGAGCTTCACATGGCTTCAGACCAACAGCCTAT |
| Sequence-based reagent | dpy-10 gRNA | Kim et al., 2014 | Andersen_Lab:crECA36 | GCTACCATAGGCACCACGAG |
| Sequence-based reagent | Human gRNA S112C and R113C | This paper | Guide_1 used in RDA_74 | TCCATCATAACGACTAGTGA |
| Sequence-based reagent | S112C repair template | This paper | 1192 DBT1-repair-S112C | ATAGCATCTGTGAAGTTCAAAGTGATAAAGCTTCTGTTACAATCACTTGTCGTTATGATGGAGTCATTAAAAAACTCTATT |
| Sequence-based reagent | R113C repair template | This paper | 1193 DBT1-repair-R113C | ATAGCATCTGTGAAGTTCAAAGTGATAAAGCTTCTGTTACAATCACTAGTTGTTATGATGGAGTCATTAAAAAACTCTATT |
| Sequence-based reagent | Fwd PCR C | This paper | 1188 DBT1-PCR-C | TtgtggaaaggacgaaacaccgAGAAGGAGATACAGTGTCTCAGT |
| Sequence-based reagent | Fwd PCR D | This paper | 1189 DBT1-PCR-D | TtgtggaaaggacgaaacaccgTGTCTCAGTTTGATAGCATCTGTG |
| Sequence-based reagent | Human gRNA W84C | This paper | Guide_2 used in RDA_75 | TCTTTTAGGTATGTAAAAGA |
| Sequence-based reagent | W84C repair template | This paper | 1195 DBT1-repair-W84C-v2 | GACTGTTTCCATAAAAGTGTCTCATTTCTTTTTCTTTTAGTTATGTGAAGGAAGGAGATACAGTGTCTCAGTTTGATAGCAT |
| Sequence-based reagent | Fwd PCR A | This paper | 1186 DBT1-PCR-A | TtgtggaaaggacgaaacaccgGCATGGCATTTACATCCTTAATATGAT |
| Sequence-based reagent | Fwd PCR B | This paper | 1187 DBT1-PCR-B | TtgtggaaaggacgaaacaccgCCTTAATATGATCTGTACTTATGACTGTTT |
| Sequence-based reagent | Rev PCR 1 | This paper | 1190 DBT1-PCR-Rev1 | TctactattctttcccctgcactgtCTACTAATGGCTTCCCCACAT |
| Sequence-based reagent | Rev PCR 2 | This paper | 1191 DBT1-PCR-Rev2 | TctactattctttcccctgcactgtCAATACCTTTTAAAGCTTCCGTTTCTAT |
| Transfected construct (Homo sapiens) | S112C and R113C Cas9-sgRNA plasmid | This paper | p1054 | |
| Transfected construct (Homo sapiens) | W84C Cas9-sgRNA plasmid | This paper | p1052 | |

Strains

Animals were fed the bacterial strain OP50 and grown at 20°C on modified nematode growth medium (NGM), containing 1% agar and 0.7% agarose to prevent burrowing of the wild isolates (Boyd et al., 2012). For each assay, strains were grown for five generations with no strain entering starvation or encountering dauer-inducing conditions (Andersen et al., 2014). Wild C. elegans isolates used for genome-wide association and recombinant inbred advanced intercross lines (RIAILs) used for linkage mapping have been described previously (Cook et al., 2016; Cook et al., 2017; Andersen et al., 2015). Strains constructed for this manuscript are listed above in the Key Resources Table.

High-throughput fitness assay

We used the high-throughput fitness assay (HTA) described previously (Andersen et al., 2015). In short, strains were passaged for four generations before bleach-synchronization and aliquoted to 96-well microtiter plates at approximately one embryo per microliter in K medium (Boyd et al., 2012). The final concentration of NaCl in the K medium for the genome-wide association (GWA) and linkage mapping assays was 51 mM. For all subsequent experiments the final NaCl concentration was 10.2 mM. The following day, hatched and synchronized L1 animals were fed HB101 bacterial lysate (Pennsylvania State University Shared Fermentation Facility, State College, PA, [García-González et al., 2017]) at a final concentration of 5 mg/ml and grown for two days at 20°C. Next, three L4 larvae were sorted using a large-particle flow cytometer (COPAS BIOSORT, Union Biometrica, Holliston, MA) into microtiter plates that contained HB101 lysate at 10 mg/ml, K medium, 31.25 µM kanamycin, and either arsenic trioxide dissolved in 1% water or 1% water alone. The animals were then grown for four days at 20°C. For linkage mapping and GWA mapping experiments, we added polychromatic fluorescent beads (Polysciences, cat. #19507–5) to each well for five minutes. The populations were treated with sodium azide (50 mM) prior to being measured with the BIOSORT. To reduce experimental costs, the polychromatic fluorescent beads were not added to follow-up experiments. For all experiments, we report the results for four independently quantified traits: the normalized brood size (norm.n), mean progeny length per well (mean.TOF), the mean optical density normalized by animal length per well (mean.norm.EXT), and the mean fluorescence normalized by animal length per well (mean.norm.yellow). All raw experimental data can be found on FigShare (https://doi.org/10.6084/m9.figshare.7458980.v2).

Calculation of fitness traits for genetic mappings

Phenotype data generated using the BIOSORT were processed using the R package easysorter, which was specifically developed for processing this type of data (Shimko and Andersen, 2014). Briefly, the function read_data reads in raw phenotype data and runs a support vector machine to identify and eliminate bubbles. Next, the remove_contamination function eliminates any wells that were identified as contaminated prior to scoring population parameters. This analysis results in processed BIOSORT data where each observation for a given strain corresponds to the measurements for an individual animal. However, the phenotypes we used for mapping and follow-up experiments are summary statistics of populations of animals in each well of a 96-well plate. The sumplate function was used to generate summary statistics of the measured parameters for each animal in each well. These summary statistics include the 10th, 25th, 50th, 75th, and 90th quantiles for time of flight (TOF), animal extinction (EXT), and three fluorescence channels (Green, Yellow, and Red), which correspond to animal length, optical density, and ability to pump fluorescent beads, respectively. Measured brood sizes (n) are normalized by the number of animals that were originally sorted into each well (norm.n). For mapping experiments, a single well replicate for each strain is summarized using the sumplate function. For follow-up experiments, multiple replicates for each strain indicated by a unique plate, well, and column were used. After summary statistics for each well were calculated, we accounted for differences between assays using the regress(assay = TRUE) function in the easysorter package. Outliers in the GWA and linkage mapping experiments were identified and eliminated using the bamf_prune function in easysorter. For follow-up experiments that contained multiple replicates for each strain, we eliminated strain replicates that were more than two standard deviations from the strain mean for each condition tested. Finally, arsenic-specific effects were calculated using the regress(assay = FALSE) function from easysorter, which accounts for strain-specific differences in growth parameters present in control conditions.
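The two-standard-deviation replicate filter for follow-up experiments can be sketched as follows. This is a Python illustration and not the easysorter implementation. The record layout (strain, condition, value) and the function names are assumptions, and the sample standard deviation is used.

```python
from collections import defaultdict
from statistics import mean, stdev


def prune_outliers(values, n_sd=2.0):
    """Drop replicates more than n_sd sample SDs from the group mean."""
    if len(values) < 3:
        return list(values)  # too few replicates to judge outliers
    m, s = mean(values), stdev(values)
    if s == 0:
        return list(values)
    return [v for v in values if abs(v - m) <= n_sd * s]


def prune_by_strain_and_condition(records, n_sd=2.0):
    """records: iterable of (strain, condition, value) tuples (hypothetical layout)."""
    groups = defaultdict(list)
    for strain, condition, value in records:
        groups[(strain, condition)].append(value)
    return {key: prune_outliers(vals, n_sd) for key, vals in groups.items()}


# Hypothetical replicate values for one strain in one condition.
replicates = [10, 11, 9, 10, 10, 11, 9, 10, 30]
print(prune_outliers(replicates))
```

Grouping by both strain and condition keeps a strain's control wells from influencing which of its arsenic wells are discarded.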

Principal component analysis of processed BIOSORT measured traits

The COPAS BIOSORT measures individual animal length (TOF), optical density (EXT), and fluorescence (green, yellow, and red). We use these data to calculate the total number of animals in a well and then normalize by the number of animals initially sorted into the well (brood size). All these measurements were then summarized using the easysorter package to generate various summary statistics of each measured parameter, including five distribution quantiles and measures of dispersion (Andersen et al., 2015). We used four independently quantified traits as inputs to principal component analysis (PCA): the normalized brood size (norm.n), mean progeny length per well (mean.TOF), the mean optical density normalized by animal length per well (mean.norm.EXT), and the mean fluorescence normalized by animal length per well (mean.norm.yellow). Although we only used fluorescent beads in the GWA and linkage mapping experiments, we found that fluorescence-based traits exhibited an arsenic-specific effect that correlated with strain sickness. Prior to PCA, HTA phenotypes were scaled to have a mean of zero and a standard deviation of one using the scale function in R. PCA was performed using the prcomp function in R (R Development Core Team, 2017). Eigenvectors were subsequently extracted from the object returned by the prcomp function.
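The scaling step that precedes PCA can be illustrated with a short Python sketch. The authors used R's scale function. This version mirrors its default behavior of centering each trait and dividing by the sample standard deviation. The trait values shown are hypothetical.

```python
from statistics import mean, stdev


def scale_column(values):
    """Center to mean zero and scale to unit sample standard deviation."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("cannot scale a constant trait")
    return [(v - m) / s for v in values]


def scale_traits(table):
    """table: dict mapping trait name -> list of per-well values."""
    return {trait: scale_column(vals) for trait, vals in table.items()}


# Hypothetical per-well trait values, not data from the study.
traits = {
    "norm.n": [30.0, 45.0, 60.0],
    "mean.TOF": [200.0, 250.0, 300.0],
}
print(scale_traits(traits))
```

Scaling puts brood size and length-based traits on a common footing, so that no single trait dominates the principal components because of its units.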

Arsenic trioxide dose-response assays

All dose-response experiments were performed on four genetically diverged strains (N2, CB4856, DL238, and JU775) in technical quadruplicates prior to performing GWA and linkage mapping experiments (Figure 1—source data 1). Animals were assayed using the HTA, and phenotypic analyses were performed as described above. The arsenic trioxide concentration for GWA and linkage mapping experiments was chosen based on an observable effect for animal length and brood size phenotypes in the presence of arsenic.

Heritability calculations

For dose-response experiments, broad-sense heritability (H²) estimates were calculated using the lmer function in the lme4 package with the following linear mixed model (phenotype ~ 1 + (1|strain)) (Bates et al., 2014). H² was then calculated as the fraction of the total variance that can be explained by the random component (strain) of the mixed model. Prior to estimating H², we removed outlier replicates that we defined as replicates with values greater than two standard deviations away from the mean phenotype. Outliers were defined on a per-trait and per-strain basis. Heritability estimates for dose response experiments are in Figure 1—figure supplement 3—source data 1.
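To make the variance-fraction definition concrete, the sketch below estimates broad-sense heritability from replicate data. It swaps in a one-way ANOVA method-of-moments estimator for a balanced design in place of the lmer mixed model the authors used, so the two approaches are related but not identical. The function name and data are hypothetical.

```python
from statistics import mean


def broad_sense_h2(groups):
    """Estimate strain variance / total variance from balanced replicates.

    groups: dict mapping strain -> list of replicate phenotypes (equal lengths).
    Uses ANOVA mean squares, an assumption that stands in for the REML fit.
    """
    sizes = {len(v) for v in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes a balanced design")
    n = sizes.pop()
    k = len(groups)
    if n < 2 or k < 2:
        raise ValueError("need at least two strains and two replicates")
    grand = mean([x for v in groups.values() for x in v])
    ms_between = n * sum((mean(v) - grand) ** 2 for v in groups.values()) / (k - 1)
    ms_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v) / (k * (n - 1))
    var_strain = max(0.0, (ms_between - ms_within) / n)  # truncate negatives at zero
    total = var_strain + ms_within
    return var_strain / total if total > 0 else 0.0


# Hypothetical replicate phenotypes for two strains.
print(broad_sense_h2({"A": [1.0, 3.0], "B": [5.0, 7.0]}))
```

With unbalanced replicates, which are likely after outlier removal, a mixed-model fit such as the one described above is the more appropriate tool.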

Heritability estimates for the linkage mapping experiment were calculated using two approaches. In both approaches, we used the previously described RIAIL genotype matrix to compute relatedness matrices (Andersen et al., 2012). In the first approach, a variance component model using the R package regress was used to estimate the fraction of phenotypic variation explained by additive and epistatic genetic factors, H², or just additive genetic factors, h² (Bloom et al., 2015; David Clifford, 2006), using the formula (y ~ 1, ~ZA+ZAA), where y is a vector of RIAIL phenotypes, ZA is the additive relatedness matrix, and ZAA is the pairwise-interaction relatedness matrix. The additive relatedness matrix was calculated as the correlation of marker genotypes between each pair of strains. In addition, a two-component variance model was calculated with both an additive and pairwise-interaction effect. The pairwise-interaction relatedness matrix was calculated as the Hadamard product of the additive relatedness matrix with itself.
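The construction of the two relatedness matrices can be sketched in a few lines of Python. The -1/1 genotype coding and the function names are assumptions made for illustration. The authors computed these matrices in R from the RIAIL genotype matrix.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def additive_relatedness(genotypes):
    """genotypes: one marker vector per strain, coded -1 or 1 (assumed coding)."""
    k = len(genotypes)
    return [[pearson(genotypes[i], genotypes[j]) for j in range(k)] for i in range(k)]


def pairwise_interaction_relatedness(additive):
    """Hadamard (element-wise) product of the additive matrix with itself."""
    return [[a * a for a in row] for row in additive]


# Hypothetical genotypes for three strains at four markers.
strains = [[1, 1, -1, -1], [1, 1, -1, -1], [1, -1, 1, -1]]
ZA = additive_relatedness(strains)
ZAA = pairwise_interaction_relatedness(ZA)
print(ZA[0][1], ZA[0][2], ZAA[0][1])
```

A strain whose marker vector has no variation would produce a zero denominator in the correlation, so real genotype data should be checked for that case.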

The second approach utilized a linear mixed model and the realized additive and epistatic relatedness matrices (Endelman, 2011; Covarrubias-Pazaran, 2016; Su et al., 2012; Endelman and Jannink, 2012). We used the mmer function in the sommer package with the formula (y ~ A + E) to estimate variance components, where y is a vector of RIAIL phenotypes, A is the realized additive relatedness matrix, and E is the epistatic relatedness matrix. This same approach was used to estimate heritability for the GWA mapping phenotype data, with the only difference being that we used the wild isolate genotype matrix described below. Heritability estimates for RIAIL and wild isolate data are in Figure 1—figure supplement 6—source data 1 and Figure 2—figure supplement 5—source data 1, respectively.

Effect size calculations for dose response assay

We first fit a linear model with the formula (phenotype ~strain) for all measured and principal component traits for each concentration of arsenic trioxide using the lm R function. Next, we extracted effect sizes using the anova_stats function from the sjstats R package (Lüdecke, 2018). Effect sizes for dose responses are in Figure 1—figure supplement 3—source data 2.

Linkage mapping

A total of 262 RIAILs were phenotyped in the HTA described previously for control and arsenic trioxide conditions (Andersen et al., 2015; Zdraljevic et al., 2017). The phenotype and genotype data were entered into R and scaled to have a mean of zero and a variance of one for linkage analysis (Figure 1—source data 2). Quantitative trait loci (QTL) were detected by calculating logarithm of odds (LOD) scores for each marker and each trait as -n(ln(1 - r²)/(2 ln(10))), where n is the number of RIAILs with phenotype data and r is the Pearson correlation coefficient between RIAIL genotypes at the marker and phenotype trait values (Bloom et al., 2013). The maximum LOD score for each chromosome for each trait was retained from three iterations of linkage mappings (Figure 1—source data 3). We randomly permuted the phenotype values of each RIAIL while maintaining correlation structure among phenotypes 1000 times to estimate the significance threshold empirically. The significance threshold was set using a genome-wide error rate of 5%. Confidence intervals were defined as the regions contained within a 1.5 LOD drop from the maximum LOD score (Broman et al., 2003).
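The LOD calculation and the 1.5-LOD-drop interval can be expressed compactly. The Python sketch below is illustrative only. The function names are hypothetical, the interval is returned as marker indices, and the interval is taken to be the contiguous run of markers around the peak that stay within the drop, which is one reasonable reading of the definition above.

```python
from math import log, sqrt


def lod_score(r, n):
    """LOD = -n * ln(1 - r^2) / (2 * ln(10)) for correlation r and n strains."""
    return -n * log(1.0 - r * r) / (2.0 * log(10.0))


def lod_drop_interval(lods, drop=1.5):
    """Indices (left, right) of the contiguous region within `drop` of the peak."""
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = right = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return left, right


# With 1 - r^2 = 0.1, the LOD score equals n / 2.
print(lod_score(sqrt(0.9), 262))
# Hypothetical LOD profile along one chromosome.
print(lod_drop_interval([1.0, 3.0, 5.0, 4.0, 2.0]))
```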

Near-isogenic line (NIL) generation

NILs were generated by crossing N2xCB4856 RIAILs to each parental genotype. For each NIL, eight crosses were performed followed by six generations of selfing to homozygose the genome. Reagents used to generate NILs are detailed in the Key Resources Table. The NIL responses to 1000 µM arsenic trioxide were quantified using the HTA described above (Figure 1—source data 4). NIL whole-genome sequencing and analysis was performed as described previously (Brady et al., 2018) (Figure 1—source data 5).

Genome-wide association mapping

Genome-wide association (GWA) mapping was performed using phenotype data from 86 C. elegans isotypes (Figure 2—source data 2). Genotype data were acquired from the latest VCF release (Release 20180527) from CeNDR that was imputed as described previously (Cook et al., 2017). We used BCFtools (Li, 2011) to filter variants that had any missing genotype calls and variants that were below 5% minor allele frequency. We used PLINK v1.9 (Purcell et al., 2007; Chang et al., 2015) to LD-prune the genotypes at a threshold of r² < 0.8, using --indep-pairwise 50 10 0.8. This resulting genotype set consisted of 59,241 markers that were used to generate the realized additive kinship matrix using the A.mat function in the rrBLUP R package (Endelman, 2011) (Figure 2—source data 3). These markers were also used for genome-wide mapping. However, because these markers still have substantial LD within this genotype set, we performed eigen decomposition of the correlation matrix of the genotype matrix using the eigs_sym function in the RSpectra package (Qiu, 2018). The correlation matrix was generated using the cor function in the correlateR R package (Bilgrau, 2018). We set any eigenvalue greater than one from this analysis to one and summed all of the resulting eigenvalues (Li and Ji, 2005). This number was 500.761, which corresponds to the number of independent tests within the genotype matrix. We used the GWAS function in the rrBLUP package to perform genome-wide mapping with the following command: rrBLUP::GWAS(pheno = PC1, geno = Pruned_Markers, K = KINSHIP, min.MAF = 0.05, n.core = 1, P3D = FALSE, plot = FALSE). To perform fine-mapping, we defined confidence intervals from the genome-wide mapping as +/- 100 SNVs from the rightmost and leftmost markers above the Bonferroni significance threshold. We then generated a QTL region of interest genotype matrix that was filtered as described above, with the one exception that we did not perform LD pruning. 
We used PLINK v1.9 to extract the LD between the markers used for fine mapping and the peak QTL marker identified from the genome-wide scan. We used the same command as above to perform fine mapping, but with the reduced variant set. The workflow for performing GWA mapping can be found on https://github.com/AndersenLab/cegwas2-nf (copy archived at https://github.com/elifesciences-publications/cegwas2-nf; Zdraljevic et al., 2019). All trait mapping results can be found on FigShare (https://doi.org/10.6084/m9.figshare.7828706.v1).
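The count of independent tests described above, in which eigenvalues greater than one are capped at one and then summed, feeds directly into a Bonferroni threshold. A minimal Python sketch of both steps follows. The eigenvalues in the example are hypothetical, and the significance level of 0.05 is an assumption because the text does not state the level used.

```python
from math import log10


def effective_number_of_tests(eigenvalues):
    """Cap each eigenvalue at one and sum, as described in the text."""
    return sum(min(ev, 1.0) for ev in eigenvalues)


def bonferroni_threshold(n_tests, alpha=0.05):
    """Significance threshold on the -log10(p) scale (alpha is assumed)."""
    return -log10(alpha / n_tests)


# Hypothetical eigenvalues of a genotype correlation matrix.
print(effective_number_of_tests([3.0, 0.5, 0.25, 0.25]))
# Threshold implied by the 500.761 independent tests reported above.
print(round(bonferroni_threshold(500.761), 3))
```

Using the effective number of tests in place of the full marker count gives a less conservative threshold when many markers are in strong LD.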

Generation of dbt-1 allele replacement strains

Allele replacement strains were generated using CRISPR-Cas9-mediated genome editing, using the co-CRISPR approach (Kim et al., 2014) with Cas9 ribonucleoprotein delivery (Paix et al., 2015). Alt-R crRNA and tracrRNA were purchased from IDT (Skokie, IL). tracrRNA (IDT, 1072532) was injected at a concentration of 13.6 µM. The dpy-10 and the dbt-1 crRNAs were injected at 4 µM and 9.6 µM, respectively. The dpy-10 and the dbt-1 single-stranded oligodeoxynucleotides (ssODN) repair templates were injected at 1.34 µM and 4 µM, respectively. Cas9 protein (IDT, 1074182) was injected at 23 µM. To generate injection mixes, the tracrRNA and crRNAs were incubated at 95°C for 5 min and 10°C for 10 min. Next, Cas9 protein was added and incubated for 5 min at room temperature. Finally, repair templates and nuclease-free water were added to the mixtures and loaded into pulled injection needles (1B100F-4, World Precision Instruments, Sarasota, FL). Individual injected P0 animals were transferred to new 6 cm NGM plates approximately 18 hr after injections. Individual F1 rollers were then transferred to new 6 cm plates to generate self-progeny. The region surrounding the desired S78C (or C78S) edit was then amplified from F1 rollers using primers oECA1163 and oECA1165. The PCR products were digested using the SfcI restriction enzyme (R0561S, New England Biolabs, Ipswich, MA). Differential band patterns signified successfully edited strains because the N2 S78C, which is encoded by the CAG codon, creates an additional SfcI cut site. Non-Dpy, non-Rol progeny from homozygous edited F1 animals were propagated. If no homozygous edits were obtained, heterozygous F1 progeny were propagated and screened for the presence of the homozygous edits. F1 and F2 progeny were then Sanger sequenced to verify the presence of the proper edited sequence. 
The phenotypes of allele replacement strains in control and arsenic trioxide conditions were measured using the HTA described above (Figure 1—source data 4). PCA phenotypes for allele-replacement strains were generated the same way as described above for GWA mapping traits and are located in Figure 1—source data 4.

Rescue with 13-methyltetradecanoic acid

Strains were grown as described for a standard HTA experiment. In addition to adding arsenic trioxide to experimental wells, we added a range of C15iso (13-methyltetradecanoic acid, Matreya Catalog # 1605) concentrations to assay rescue of arsenic effects (Figure 4—source data 4).

Growth conditions for metabolite profiling

For L1 larval stage assays, chunks (~1 cm) were taken from starved plates and placed on multiple fresh 10 cm plates. Prior to starvation, animals were washed off of the plates using M9, and embryos were prepared by bleach synchronization. Approximately 40,000 embryos were resuspended in 25 ml of K medium and allowed to hatch overnight at 20°C. L1 larvae were fed 15 mg/ml of HB101 lysate the following morning and allowed to grow at 20°C for 72 hr. We harvested 100,000 embryos from gravid adults by bleaching. These embryos were hatched overnight in 50 ml of K medium in a 125 ml flask. The following day, we added arsenic trioxide to a final concentration of 100 µM and incubated the cultures for 24 hr. After 24 hr, we added HB101 bacterial lysate (2 mg/ml) to each culture. Finally, we transferred the cultures to 50 ml conical tubes, centrifuged the cultures at 3000 RPM for 3 min to separate the pellet and supernatant. The supernatant and pellets from the cultures were frozen at −80°C and prepared for analysis. For young adult stage assays, 45,000 animals per culture were prepared as described above but in S medium, at a density of three animals per microliter, and fed HB101 lysate (5 mg/mL). These cultures were shaken at 200 RPM, 20°C in 50 mL Erlenmeyer flasks for 62 hr. For harvesting, we settled 15 mL of cultures for 15 min at room temperature and then pipetted the top 12 mL of solution off of the culture. The remaining 3 mL of animal pellet was washed with 10 mL of M9, centrifuged at 1000 g for one minute, and then the supernatant removed. This wash was repeated once more with M9 and again with water. The final nematode pellet was snap frozen in liquid nitrogen.
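As a quick consistency check on the young adult cultures, the stated animal number and density imply the culture volume. The sketch below only restates that arithmetic; the function names are illustrative.

```python
def culture_volume_ml(n_animals, animals_per_ul):
    """Volume (mL) that holds n_animals at the given density (animals per µL)."""
    if animals_per_ul <= 0:
        raise ValueError("density must be positive")
    return n_animals / animals_per_ul / 1000.0


def lysate_mass_mg(volume_ml, mg_per_ml):
    """Mass of HB101 lysate (mg) needed to reach mg_per_ml in volume_ml."""
    return volume_ml * mg_per_ml


if __name__ == "__main__":
    vol = culture_volume_ml(45_000, 3)  # 45,000 animals at 3 animals/µL
    print(f"culture volume: {vol:.1f} mL")
    print(f"lysate needed at 5 mg/mL: {lysate_mass_mg(vol, 5):.0f} mg")
```

With 45,000 animals at three animals per microliter, the volume works out to 15 mL, which matches the 15 mL of culture settled at harvest.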

Nematode metabolite extractions

Pellets were lyophilized 18–24 hr using a VirTis BenchTop 4K Freeze Dryer until a chalky consistency was achieved. Dried pellets were transferred to 1.5 mL microfuge tubes and dry pellet weight recorded. Pellets were disrupted in a Spex 1600 MiniG tissue grinder after the addition of three stainless steel grinding balls to each sample. Microfuge tubes were placed in a Cryoblock (Model 1660) cooled in liquid nitrogen, and samples were disrupted at 1100 RPM for two cycles of 30 s. Each sample was individually dragged across a microfuge tube rack eight times, inverted, and flicked five times to prevent clumping. This process was repeated two additional rounds for a total of six disruptions. Pellets were transferred to 4 mL glass vials in 3 mL 100% ethanol. Samples were sonicated for 20 min (on/off pulse cycles of two seconds at power 90 A) using a Qsonica Ultrasonic Processor (Model Q700) with a water bath cup horn adaptor (Model 431C2). Following sonication, glass vials were centrifuged at 2750 RCF for five minutes in an Eppendorf 5702 Centrifuge using rotor F-35-30-17. The resulting supernatant was transferred to a clean 4 mL glass vial and concentrated to dryness in an SC250EXP Speedvac Concentrator coupled to an RVT5105 Refrigerated Vapor Trap (Thermo Scientific). The resulting powder was suspended in 100% ethanol according to its original dry pellet weight: 0.01 mL 100% ethanol per mg of material. The suspension was sonicated for 10 min (pulse cycles of 2 s on and 3 s off at power 90 A) followed by centrifugation at 20,817 RCF in a refrigerated Eppendorf centrifuge 5417R at 4°C. The resulting supernatant was transferred to an HPLC vial containing a Phenomenex insert (cat #AR0-4521-12) and centrifuged at 2750 RCF for five minutes in an Eppendorf 5702 centrifuge. The resulting supernatant was transferred to a clean HPLC vial insert and stored at −20°C or analyzed immediately.
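Resuspension is normalized to dry pellet weight at 0.01 mL of ethanol per mg of material. A minimal sketch of that scaling follows; the example pellet weights are hypothetical.

```python
ETHANOL_ML_PER_MG = 0.01  # from the text: 0.01 mL 100% ethanol per mg dry weight


def resuspension_volume_ul(dry_weight_mg):
    """Ethanol volume (µL) for a pellet of the given dry weight (mg)."""
    if dry_weight_mg <= 0:
        raise ValueError("dry weight must be positive")
    return dry_weight_mg * ETHANOL_ML_PER_MG * 1000.0


if __name__ == "__main__":
    # Hypothetical dry pellet weights in mg.
    for weight in (8.0, 12.5, 20.0):
        print(f"{weight} mg -> {resuspension_volume_ul(weight):.0f} µL ethanol")
```

Scaling the volume to dry weight keeps the extract concentration comparable across samples of different size.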

Mass spectrometric analysis

Reversed-phase chromatography was performed using a Dionex Ultimate 3000 Series LC system (HPG-3400 RS High Pressure pump, TCC-3000RS column compartment, WPS-3000TRS autosampler, DAD-3000 Diode Array Detector) controlled by Chromeleon Software (ThermoFisher Scientific) and coupled to an Orbitrap Q-Exactive mass spectrometer controlled by Xcalibur software (ThermoFisher Scientific). Metabolites were separated on a Kinetex EVO C18 column, 150 mm x 2.1 mm, particle size 1.7 µm, maintained at 40°C with a flow rate of 0.5 mL/min. Solvent A: 0.1% ammonium acetate in water; solvent B: acetonitrile (ACN). A/B gradient started at 5% B for 30 s, followed by a linear gradient to 95% B over 13.5 min, then a linear gradient to 100% B over 3 min. 100% B was maintained for 1 min. Column was washed after each run with 5:1 isopropanol:ACN, flow rate of 0.12 mL/min for 5 min, followed by 100% ACN for 2.9 min, a linear gradient to 95:5 water:ACN over 0.1 min, and then 95:5 water:ACN for 2 min with a flow rate of 0.5 mL/min. A heated electrospray ionization source (HESI-II) was used for the ionization with the following mass spectrometer parameters: spray voltage: 3 kV; capillary temperature: 320°C; probe heater temperature: 300°C; sheath gas: 70 AU; auxiliary gas flow: 2 AU; resolution: 240,000 FWHM at m/z 200; AGC target: 5e6; maximum injection time: 300 ms. Each sample was analyzed in negative and positive modes with m/z range 200–800. Fatty acids and most ascarosides were detected as [M-H]- ions in negative ionization mode. Peaks of known abundant ascarosides and fatty acids were used to monitor mass accuracy, chromatographic peak shape, and instrument sensitivity for each sample. Processed metabolite measures can be found in Figure 4—source datas 13 and Figure 4—figure supplement 4—source datas 1 (Artyukhin et al., 2018).
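The analytical gradient described above can be written as a piecewise-linear function of time. The sketch below encodes only the four analytical segments (hold, two linear ramps, final hold) and leaves out the post-run column wash. The segment boundaries are derived by summing the stated durations.

```python
# (start_min, end_min, start_%B, end_%B), from the gradient described in the text.
SEGMENTS = [
    (0.0, 0.5, 5.0, 5.0),       # 5% B held for 30 s
    (0.5, 14.0, 5.0, 95.0),     # linear ramp to 95% B over 13.5 min
    (14.0, 17.0, 95.0, 100.0),  # linear ramp to 100% B over 3 min
    (17.0, 18.0, 100.0, 100.0), # 100% B held for 1 min
]


def percent_b(t_min):
    """Return the programmed %B (acetonitrile) at time t_min of the run."""
    if not 0.0 <= t_min <= SEGMENTS[-1][1]:
        raise ValueError("time is outside the analytical gradient")
    for start, end, b0, b1 in SEGMENTS:
        if t_min <= end:
            frac = (t_min - start) / (end - start)
            return b0 + frac * (b1 - b0)


if __name__ == "__main__":
    for t in (0.0, 0.5, 7.25, 14.0, 15.5, 18.0):
        print(f"t = {t:5.2f} min -> {percent_b(t):6.2f}% B")
```

Under this reading the analytical portion of each run lasts 18 min before the wash begins.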

Statistical analyses

All p-values for differences among strain phenotypes in the NIL, allele-replacement, and C15iso experiments were calculated in R using the TukeyHSD function on an ANOVA model with the formula (phenotype ~ strain). p-Values of individual pairwise strain comparisons are reported in each figure legend.
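The analyses themselves were run in R. For readers who want to see the variance partition behind a model of the form phenotype ~ strain, the following is a minimal sketch in standard-library Python that computes only the one-way ANOVA F statistic. It does not reproduce the Tukey HSD step, which needs the studentized range distribution, and the strain labels and phenotype values are hypothetical.

```python
from statistics import mean


def one_way_anova(groups):
    """One-way ANOVA for {strain: [phenotype values]}.

    Returns (F statistic, between-group df, within-group df).
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    # Between-strain sum of squares: strain means around the grand mean.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Within-strain sum of squares: observations around their strain mean.
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical phenotype values for three hypothetical strains.
    data = {"strain_A": [1, 2, 3], "strain_B": [2, 3, 4], "strain_C": [5, 6, 7]}
    f_stat, df_b, df_w = one_way_anova(data)
    print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F indicates that strain means differ more than replicate noise would predict. The pairwise comparisons reported in the figure legends then identify which strains differ.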

CRISPR-Cas9 gene editing in human cells

The 293 T cells were sourced from CCLE. Identity authenticated by SNP profiling. Cells were regularly tested for Mycoplasma (~bimonthly). Gene-editing experiments were performed in a single parallel culture experiment using human 293 T cells (ATCC) grown in DMEM with 10% FBS. On day zero, 300,000 cells were seeded per well in a six-well plate format. The following day, two master mixes were prepared: a) LT-1 transfection reagent (Mirus) was diluted 1:10 in Opti-MEM and incubated for 5 min; b) a DNA mix of 500 ng Cas9-sgRNA plasmid (Supplementary file 12) with 250 pmol repair template oligonucleotide was diluted in Opti-MEM in a final volume of 100 μL. 250 μL of the lipid mix was added to each of the DNA mixes and incubated at room temperature for 25 min. Following incubation, the full 350 μL volume of DNA and lipid mix was added dropwise to the cells. These six-well plates were then centrifuged at 1000 x g for 30 min. After 6 hr, the media on the cells was replaced. For the next 6 days, cells were expanded and passaged as needed. On day 7, one million cells were taken from each set of edited and unedited cells and placed into separate T75s with either media-only or 5 µM arsenic-containing media. Days 7 to 14, arsenic and media-only conditions were maintained at healthy cell densities. Days 14 to 18, arsenic exposed cell populations were maintained off arsenic to allow the populations to recover prior to sequencing. Media-only conditions were maintained in parallel. On day 18, all arsenic and media-only conditions were pelleted for genomic DNA extraction.

Analysis of CRISPR-Cas9 editing in human cells

Genomic DNA was extracted from cell pellets using the QIAGEN (QIAGEN, Hilden, Germany) Midi or Mini Kits based on the size of the cell pellet (51183, 51104) according to the manufacturer’s recommendations. DBT1 loci were first amplified with 17 cycles of PCR using a touchdown protocol and the NEBnext 2x master mix (New England Biolabs M0541). The resulting product served as input to a second PCR, using primers that appended a sample-specific barcode and the necessary adaptors for Illumina sequencing. The resulting DNA was pooled, purified with SPRI beads (A63880, Beckman Coulter, Brea, CA), and sequenced on an Illumina MiSeq with a 300-nucleotide single-end read with an eight nucleotide index read. For each sample, the number of reads exactly matching the wild-type and edited DBT1 sequence were determined (Figure 5—source data 1).
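Counting reads that exactly match the wild-type or edited sequence can be sketched as below. This is an illustration and not the pipeline used in the study. It assumes that "exactly matching" means the read contains the target window verbatim, and the two target sequences are hypothetical placeholders, not the DBT1 sequences.

```python
from collections import Counter

# HYPOTHETICAL target windows that differ only at the edited position.
WILD_TYPE = "ACGTTCAGGA"
EDITED = "ACGTTGTGGA"


def count_exact_matches(reads, wild_type=WILD_TYPE, edited=EDITED):
    """Classify each read as wild_type, edited, or other by exact substring match."""
    counts = Counter()
    for read in reads:
        if wild_type in read:
            counts["wild_type"] += 1
        elif edited in read:
            counts["edited"] += 1
        else:
            counts["other"] += 1  # sequencing errors, indels, off-target reads
    return counts


def edited_fraction(counts):
    """Fraction of classified reads (wild type + edited) carrying the edit."""
    classified = counts["wild_type"] + counts["edited"]
    return counts["edited"] / classified if classified else float("nan")


if __name__ == "__main__":
    reads = ["TT" + WILD_TYPE + "CC", "TT" + EDITED + "CC",
             "TT" + EDITED + "CC", "TTACGTTNNGGACC"]
    counts = count_exact_matches(reads)
    print(dict(counts), f"edited fraction = {edited_fraction(counts):.2f}")
```

Comparing the edited fraction between arsenic and media-only populations indicates whether the edit changes relative fitness under arsenic exposure.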

Preparing human cells for mass spectroscopy

Mass spectroscopy experiments used human 293 T cells (ATCC) grown in DMEM with 10% FBS. On day zero, 150,000 cells were seeded into 15 cm tissue culture dishes with 15 mL of either 2.5 µM arsenic or no arsenic media. Each condition had five replicates. On day 3, the no arsenic cells were approaching confluence and required passaging. Arsenic conditions were at ~30% confluence and received a media change. On day seven, both conditions were near confluence; media was removed, plates were rinsed with ice cold PBS, and the remaining liquid was removed. Plates were frozen at −80°C before processing for mass spectrometric analysis. Cells were scraped off the plates with PBS and pelleted in microfuge tubes. Cell pellets were lyophilized 18–24 hr using a VirTis BenchTop 4K Freeze Dryer and extracted in 100% ethanol using the same sonication program as described for nematode extraction. Following sonication, samples were centrifuged at 20,817 RCF in a refrigerated Eppendorf centrifuge 5417R at 4°C. Clarified supernatant was aliquoted to a new tube and concentrated to dryness in an SC250EXP Speedvac Concentrator coupled to an RVT5105 Refrigerated Vapor Trap (Thermo Scientific). The resulting material was suspended in 0.1 mL 100% ethanol and analyzed by LC-MS as described. Metabolite measurements can be found in Figure 5—source data 3.

Tajima’s D calculation

We used the VCF corresponding to CeNDR release 20160408 (https://elegansvariation.org/data/release/20160408) to calculate Tajima’s D. Tajima’s D was calculated using the tajimas_d function in the cegwas package using default parameters (window size = 500 SNVs, sliding window distance = 50 SNVs, outgroup = N2) (Figure 2—source data 3). Isolation locations of strains can be found in Figure 2—figure supplement 3—source data 1.
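For orientation, the statistic and the windowing scheme can be sketched as follows. This is a minimal standard-library illustration of Tajima's D (Tajima, 1989) over windows defined by SNV count. It is not the cegwas tajimas_d implementation, it ignores the outgroup setting, and it assumes biallelic sites summarized by their minor-allele counts.

```python
from math import sqrt


def tajimas_d(minor_counts, n):
    """Tajima's D from minor-allele counts at biallelic sites among n sequences.

    Returns None when D is undefined (no segregating sites, or n < 4).
    """
    counts = [k for k in minor_counts if 0 < k < n]  # segregating sites only
    s = len(counts)
    if s == 0 or n < 4:
        return None
    # Mean pairwise differences (pi), summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


def sliding_windows(n_sites, size=500, step=50):
    """Half-open (start, end) SNV index ranges: `size` SNVs wide, moved by `step`."""
    return [(start, start + size) for start in range(0, n_sites - size + 1, step)]


if __name__ == "__main__":
    # Toy example: 10 sequences, an excess of rare variants gives D < 0.
    print(tajimas_d([1, 1, 1, 1], n=10))
    print(sliding_windows(600))
```

Defining windows by SNV count rather than by physical distance gives every window the same number of segregating sites, so D values are comparable across regions of differing variant density.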

References

  11. AE Bilgrau (2018) correlateR. Github.
  19. GQ Chen, XG Shi, W Tang, SM Xiong, J Zhu, X Cai, ZG Han, JH Ni, GY Shi, PM Jia, MM Liu, KL He, C Niu, J Ma, P Zhang, TD Zhang, P Paul, T Naoe, K Kitamura, W Miller, S Waxman, ZY Wang, H de The, SJ Chen, Z Chen (1997) Use of arsenic trioxide (As2O3) in the treatment of acute promyelocytic leukemia (APL): I. As2O3 exerts dose-dependent dual effects on APL cells. Blood 89:3345–3353.
  23. DE Cook, S Zdraljevic, JP Roberts, EC Andersen (2017) CeNDR, the Caenorhabditis elegans natural diversity resource. Nucleic Acids Research, 45, 10.1093/nar/gkw893, 27701074.
  26. PM David Clifford (2006) The regress function. R News pp. 6–10.
  28. N Emi (2017) Chemotherapy for Leukemia. 221–238, Arsenic Trioxide: Clinical Pharmacology and Therapeutic Results, Chemotherapy for Leukemia, Singapore, Springer, 10.1007/978-981-10-3332-2_13.
  32. SA Forbes, G Bhamra, S Bamford, E Dawson, C Kok, J Clements (2008) The catalogue of somatic mutations in cancer (COSMIC). Current Protocols in Human Genetics, 11.
  36. F Grignani, M Valtieri, M Gabbianelli, V Gelmetti, R Botta, L Luchetti, B Masella, O Morsilli, E Pelosi, P Samoggia, PG Pelicci, C Peschle (2000) PML/RAR alpha fusion protein expression in normal human hematopoietic progenitors dictates myeloid commitment and the promyelocytic phenotype. Blood 96:1531–1537.
  38. J Hodgkin, T Doniach (1997) Natural variation and copulatory plug formation in Caenorhabditis elegans. Genetics 146:149–164.
  61. Y Qiu (2018) RSpectra. Github.
  62. R Development Core Team (2017) R: a language and environment for statistical computing. Vienna, Austria, http://www.r-project.org/.
  64. P Ravenscroft, H Brammer, K Richards (2009) Arsenic Pollution: A Global Synthesis. Chichester: Wiley-Blackwell.
  65. LJ Reed, ML Hackert (1990) Structure-function relationships in dihydrolipoamide acyltransferases. The Journal of Biological Chemistry 265:8971–8974.
  73. F Tajima (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

Decision letter

  1. Piali Sengupta
    Reviewing Editor; Brandeis University, United States
  2. Michael A Marletta
    Senior Editor; University of California, Berkeley, United States
  3. Meng C Wang
    Reviewer; Baylor College of Medicine, HHMI, United States

In the interests of transparency, eLife includes the editorial decision letter, peer reviews, and accompanying author responses.

[Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed.]

Thank you for submitting your article "Natural variation in arsenic toxicity is explained by differences in branched chain amino acid catabolism" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Michael Marletta as the Senior Editor. The following individual involved in review of your submission has agreed to reveal her identity: Meng C Wang (Reviewer #2).

The Reviewing Editor has highlighted the concerns that require revision and/or responses, and we have included the separate reviews below for your consideration. If you have any questions, please do not hesitate to contact us.

Summary:

As you will see from the reviews, both reviewers appreciated the breadth of the work and the potential overall interest. The major strengths of the work are:

a) The identification of a new gene (DBT-1) in regulating arsenic toxicity in both C. elegans and mammalian cells.

b) The comprehensive experimental approach starting from analyses of natural variation in C. elegans strains to identification of a genetic contributor to this variation, followed by biochemical characterization of the encoded enzyme.

However, both reviewers also raised a number of issues that are summarized in brief below.

Major concerns:

The major concern is that it is unclear from the presented data to what extent variation in dbt-1 contributes to the observed genetic variation in arsenic toxicity. The methods and analyses (particularly for the GWAS) are either not presented or not presented in sufficient detail. Raw data in some cases are also difficult to interpret. For instance, there are no analyses presented of the contribution of between and among-strain variance to the total variance. The PCA approach as presented is flawed as outlined by reviewer 1. There is also a concern regarding the methodology used for QTL mapping and the lack of estimates of QTL sizes.

The recommendation for the above is to include raw and re-analyzed data along with detailed statistical analyses and discussion to address the concerns raised about the interpretation of the quantitative genetics experiments.

Reviewer 2 raises an experimental concern regarding the induction of BCAA levels by the DBT-1 cysteine variant in C. elegans and 293 cells that should be addressed.

Minor concerns:

The authors may also wish to take the comments from reviewer 1 into consideration regarding the scholarliness of the presentation.

Reviewer 2 raises a minor issue regarding consistency in the naming of proteins across the manuscript.

The biological system should be indicated in the title and/or Abstract:

The title and/or Abstract should provide a clear indication of the biological system under investigation (i.e., species name or broader taxonomic group, if appropriate). Please revise your title and/or Abstract with this advice in mind.

Separate reviews (please respond to each point):

Reviewer #1:

This paper provides an unusually broad and comprehensive approach of moving from natural variation for arsenic response in C. elegans to genetic mapping to gene identification to biochemical characterization. Overall, the topic is very well introduced and motivated. In some ways, the scope is almost too large for any single reviewer to assess. For this purpose, I will therefore concentrate on the genetics side of the work and let others assess the biochemistry and health implications (which does feel a bit cloying at times).

The authors use a four-fold strategy to identify the arsenic resistance locus: QTL mapping, NIL construction, GWAS, and genetic transformation via CRISPR. There can be little doubt that they have identified a gene that generates increased arsenic resistance in the CB4856 line, as the allele-specific replacements via CRISPR would be the gold standard for this. On this basis, the fundamental results and conclusions of the paper appear to be on a solid foundation, providing an appropriate context for the biochemical characterization that forms the second half of the paper.

The primary concerns that I have with the work relate to the lead-in to the mapping results, especially around the GWAS results and how they are portrayed. While the paper definitely appears to identify "a gene" for arsenic resistance, we are given very little context for evaluating how important this gene is in the context of genetic variation for arsenic resistance per se. Instead, the mapping results are used to move directly to proposed significance in human health, but we do not really know what the relevance to the worms themselves is in the first place.

This paper clearly approaches this problem from a quantitative genetic perspective, and yet we are provided very little quantitative genetic context in which to evaluate the mapping approach, especially for the GWAS. Specifically, there is no presentation of a partitioning of within and among line variance, which is necessary to understand what kind of expectations one might have in terms of mapping outcomes, especially with respect to power. In the "transparent reporting" document that accompanies the manuscript, it is stated that "Replicates and power were determined by the throughput required for the particular assay or variance explained." First, it is not even clear what this statement means other than waving off the reporting standard being imposed by eLife. More importantly, however, this simply cannot be the case for the GWAS, or at least it cannot be evaluated in the absence of some sort of analysis of the quantitative genetics of the traits involved.

To attempt to address this concern, at a fairly high degree of difficulty, I worked to figure out how the data for this experiment actually worked. While we are at a brave new world of open data, which is a good thing, there could be some debate as to whether the data standard should be to simply make all raw data available or if there should be some effort to curate the data such that an external reader is told what the columns mean (e.g., what is the role of stage in the statistical outcomes), which are critical for evaluation of the results (e.g., temporal blocking), etc. I note that the Materials and methods section on "statistical analysis" outlines the most trivial model possible, so I assume that is the approach used throughout the paper. In the absence of this information, I simply tried to use a GLM analysis of the GWAS population samples to estimate (via ML) the within and among line variances across environmental treatments. I found that the among-strain variance accounted for about 1-2% of the total variance observed among individuals, while the condition-by-strain variance explained about 2-10%. Now this is for tens of thousands of observations, so these estimates are for the most part quite significant. However, it is the total among-strain (or strain-by-condition) variance that will determine the power for identifying any particular marker as significant as opposed to the overall heritability. I have little doubt that my quick and dirty look at this is flawed. However, I would expect to see something like this in the paper itself, at least as a supplementary table. Lack of this analysis also does not allow the authors to say anything about effect sizes, which are also notably absent from the paper.

It is perhaps then not surprising that no results turned up for the single-trait analysis. The authors therefore turn to a PCA approach, which itself features some fairly unusual aspects in terms of presentation. There appear to be five traits measured on each individual, yet the PCA is based on 67 measurements ("traits"). These are apparently largely summary statistics for individuals from each strain. The PCA must therefore have been generated via the among-line statistics in the absence of within-line variation, which is fine. What seems less fine is that the "traits" used here are actually all derivatives of the original five traits and are mostly descriptions of the within-line distributions. This means that many of the measures are functions of one another (e.g., the mean and CV). Indeed, I am surprised that the PCA works at all, as the correlation matrix for this set of traits should be singular. As importantly, the new PCA "traits" that emerge from such an analysis essentially have no biological meaning. They are almost entirely statistical constructions. I cannot think of any valid reason to do this, unless variance per se is the trait of interest, in which case that should be directly addressed in a much different fashion. Unless motivated quite differently, the PCA should be conducted on just the original five traits.

It is also odd that the interpretation for the PCA results is then motivated by a regression of PCA1 on body length (Supplementary Figure 4). That is exactly what the PCA loading (eigenvector coefficient) provides in the PCA analysis. If the PCA is to be conducted, then it should be done so in the context of a full analysis, which would include reporting the factor loadings. It seems that nearly all of these features simply measure how big something is (e.g., color brightness) and that including each of these measures into a single composite parameter allows the GWAS to cross a variance threshold that provides the single significant point. The PCA itself does not completely make sense from a pure statistical point of view, and so it appears that this mapping result is the primary motivation for the approach.

A final mapping oddity is that the authors report that they use full DNA strain-specific sequencing results to calculate the relatedness matrix, but only use much older RAD marker data to do the actual mapping. The only motivation that I can see for this is to reduce the number of comparisons so that the significance threshold does not become too high. Whatever the motivation, it is not specified by the authors, nor are potential power issues addressed in the paper itself or in the "transparent reporting" document.

Taken together, the GWAS approach appears to have the hallmarks of "p-hacking." In other words, the primary goal of the analysis is to ensure that there is at least one red dot on Figure 2, such that it supports the conclusions of the QTL mapping. I am not saying that this is in fact what the authors have done; simply that one gets that impression because of the statistical gymnastics that it seems to take to get here, coupled with a realistic assessment of what is actually possible to map for these traits in these populations. This view is further supported by the singular and assertive nature of the narrative, in which limitations and alternative explanations are not entertained. I fully appreciate that this is the nature of the style of presentation in top tier journals such as eLife, in which a clear narrative coupled to a specific mechanism seems to be a requirement for publication. However, eLife's new data standards seem to be at odds with this cultural tradition. They need to figure out how they actually want to have this work. I do feel for the authors that they may be caught between an editorial expectation for clarity and the views of this specific reviewer. And I may be entirely wrong in this, which makes this specific form of publication cycle more awkward than a traditional feedback and response pattern around concerns such as this.

Similar concerns exist in the QTL mapping experiment, although the results appear to be clearer here. By my calculations the among strain variance for the RIAILs tends to be around 1-2% and the strain-by-condition variance tends to be 2-8%. Analysis of the genetic variance here is relevant to the interpretation of the results, especially from the broader worm natural variation point of view, which I recognize is not a point of emphasis in the majority of the paper. Nevertheless, estimates of QTL effect sizes will help the reader interpret both these results and the interpretation of the population genetic and other functional explanations later in the paper.

The gem of the work is the allele-specific transformation that clearly demonstrates that DBT-1 has a significant effect on arsenic resistance. I do not know why these transformations were done only on the parental lines and not the NILs, although this work can be technically demanding. The latter transformations would be just a little more convincing in proving that this is actually the causal locus from the mapping experiments and not simply "a locus" that influences arsenic resistance, since there is plenty of residual difference among the parents in the transformants. However, the totality of the evidence definitely supports this interpretation, and this work is well beyond the standard in the field on this account.

I do not understand why there are no error bars on the selection experiments reported in Figure 5. Nor are there any statistical tests associated with these results that I can see. I found the raw data for these results impossible to interpret (Data File 23). Something quite important appears to be missing here, making this section impossible to interpret from a rigor and reproducibility point of view, much less in terms of scientific interpretation. I must be missing something here, so I apologize if it should be more obvious to me.

While I appreciate the desire to posit an adaptive significance on the identified alleles, we know that the central regions of C. elegans chromosomes (like the center of chromosome 2) show clear signs of global selective sweeps. It seems just as likely, if not more so, that this allele is simply along for the ride in a linked haplotype and does not in fact have any functional significance in natural populations. It is all well and good to suggest that more work is needed to explore the environmental circumstances that might generate selection, but it is somewhat disingenuous in this species to not discuss the potential for neutral processes in structuring this variation. As noted above, in general, caveats and alternative explanations are not a feature of this work, which tells a singular story from a very particular point of view. This does not make that view incorrect, but it does not seem to me to be the most rigorous way to present what is a truly impressive amount of work on a very difficult question.

Reviewer #2:

In this manuscript, the authors have characterized dbt-1, encoding the E2 subunit of the branched-chain α-keto dehydrogenases, as a new target for arsenic toxicity. In C. elegans, the arsenic treatment causes decreased brood size and reduced progeny length, but different wild type isolates show different sensitivity to the treatment. In particular, the CB4856 is more resistant. Using linkage mapping followed by NIL analysis and PCA-lined GWA mapping, the authors identified a single SNV correlated with the arsenic resistance that causes a cysteine to serine missense variant in the dbt-1 gene. Using CRISPR/Cas9 genome editing, they further confirmed that this variant is responsible for the arsenic resistance. Given the role of DBT-1 in regulating mono-methyl branched chain fatty acids (mmBCFA) and the role of mmBCFA in regulating C. elegans development, the authors next compared mmBCFA levels between two variants with and without the arsenic treatment. They found that with the cysteine variant, the mmBCFA reduction is more sensitive to the arsenic treatment, and mmBCFA supplementation could rescue the arsenic sensitivity associated with the cysteine variant. Furthermore, the authors also showed that the serine to cysteine alteration in the human DBT1 is associated with increased cell proliferation with the arsenic treatment. Together, this manuscript demonstrates a new link between DBT-1 and arsenic responses. There are one major and one minor issue that should be addressed before publication.

1) Major: The cysteine variant is associated with increased arsenic sensitivity in C. elegans, but with increased cell proliferation in 293 cells treated with arsenic. The authors suggest that the increased cell proliferation is due to the induction of BCAA levels. However, this is not supported by the data using 293 cells. In C. elegans, the cysteine variant is associated with the induction of BCAA levels with the CB4856 background (Figure 4B). With the N2 background, the induction trend is observed but not significantly different (Figure 4B). But the N2-DBT-1(S78) without arsenic has only two data sets, which might reduce the power of statistical analysis.

2) Minor: To avoid unnecessary confusion, the authors might consider changing sample labeling in Figure 4C to be consistent with other figure panels. N2 DBT-1 (S78) instead of N2 DBT-1 (C78S), CB4856 DBT-1 (C78) instead of CB4856 DBT-1 (S78C)

[Editors' note: the evaluation of the revised submission follows.]

The editors and two reviewers have re-assessed your work, "Natural variation in C. elegans arsenic toxicity is explained by differences in branched chain amino acid metabolism". The reviews follow below. As you will see, while both reviewers appreciate the scope and complexity of the analysis, neither reviewer is fully satisfied with the revisions and they continue to have major reservations. Given this state, the Reviewing Editor's assessment is that "major issues remain unresolved". If you elect to go forward with publication, this assessment will appear below the Abstract and at the top of the Decision Letter.

Reviewer 1 in particular remains unconvinced regarding the justification for the PCA. Reviewer 2 notes that it has not been demonstrated that the increased cell proliferation in HEK293 cells is due to increased BCAA levels. More details are in the reviews below.

Reviewer #1:

Thanks to the authors for addressing reviewer concerns. I would first like to start by apologizing for some of the phrasing that made it into the first review. My intention was to initiate a conversation with the other editor and other reviewers that would then be synthesized into an overall statement, as is my understanding of the eLife process. Thus the presence of comments related to the editorial approach of the journal, which are irrelevant to the authors. I was trying to be very clear in my concerns and alternative interpretations for the benefit of the editorial process. So while the concerns that I raised are valid (and in many cases remain), my phrasing was more blunt than I would want to normally convey, especially given the scope and complexity of the work. It is my responsibility for not recognizing how the process was proceeding and not further editing my comments as part of the final decision process. This is obviously especially important given the public nature of this new review process.

With that preamble, I appreciate that the authors have clarified a great deal of the analysis, particularly the use of line means in the quantitative genetic analysis. The revision remains as the initial submission: a comprehensive analysis of natural variation in arsenic sensitivity. The overall conclusions are still well supported. Nevertheless, I find that the multivariate approach for the mapping analysis is not well justified. This need not be blown out of proportion, but must still be pointed out.

1) The authors have now greatly clarified the conversion of the raw observations into the mapping approach. This was a bit opaque before, but it is now clear that this is a line-means analysis, which is perfectly appropriate for mapping, and the newly reported results show that there is substantial among-line variation (H2 for their traits). My original point relative to inferences within natural populations remains, however. These line-based differences actually explain very little about the susceptibility of individual worms to arsenic, which is largely due to random effects. The experimental design concerns discussed here mean that the values that I calculated are actually upper bounds on the actual broad-sense heritability in the population. This may or may not be important for the problem at hand, but it is important for the interpretation of these results from the standpoint of worm populations.

2) In the first version of the paper, the authors reported that they did not identify any significant peaks for the single trait analysis so they performed the mapping on the PCA. This is what was meant by "no results turned up for the single-trait analysis". The reviewer did not miss the summaries and supplementary materials. It did not mean that no results were presented, which is something quite different. The current manuscript has been modified to simply gloss over these issues and go directly to the PCA. While the authors "respectfully disagree" with criticisms of the PCA approach, they do not address the issues. The approach is statistically, mathematically, and biologically invalid. There is simply no justification for throwing summary statistics that are functions of one another into a PCA except to try to minimize variance. Showing that an arbitrary and invalid PCA is correlated with a biological measure begs the question as to why one would not simply use the measure itself. Yes, it is possible to perform a PCA on a singular covariance matrix, but the outcomes tend to be highly unstable in the face of sampling variance. It is the eigenvectors that are the elements that are subject to interpretation. It is still the case that nothing about this analysis makes sense except that it yields a result that is consistent with the other, more convincing parts of the paper. It still feels that the main goal here was to derive a trait that yielded a significant result. The non-significant pileups of the other traits could have been subjected to a joint probability test to achieve a similar goal in a much more straightforward and interpretable fashion.

3) The genomic approach and markers used in the GWAS is now much clearer.

4) The selection experiments in the human cell lines now contain error bars and significance tests, which is obviously a good thing. These effects are obviously not particularly large and selection among human cells seems like it could be subject to a number of complicating interpretations, but the result is well consistent with the overall results and interpretation.

Reviewer #2:

In the revised manuscript, the authors have addressed my concern on BCAA changes in C. elegans with more biological replicates and stronger statistical power. However, my comments on BCAA changes in 293 cells have not been addressed. To demonstrate the evolutionary conservation, the authors showed that the cysteine variant is associated with increased cell proliferation in 293 cells treated with arsenic. They suggest that this increased cell proliferation is due to the induction of BCAA levels. However, this is not supported by the data using 293 cells.

https://doi.org/10.7554/eLife.40260.080

Author response

Reviewer #1:

This paper provides an unusually broad and comprehensive approach of moving from natural variation for arsenic response in C. elegans to genetic mapping to gene identification to biochemical characterization. Overall, the topic is very well introduced and motivated. In some ways, the scope is almost too large for any single reviewer to assess. For this purpose, I will therefore concentrate on the genetics side of the work and let others assess the biochemistry and health implications (which does feel a bit cloying at times).

The authors use a four-fold strategy to identify the arsenic resistance locus: QTL mapping, NIL construction, GWAS, and genetic transformation via CRISPR. There can be little doubt that they have identified a gene that generates increased arsenic resistance in the CB4856 line, as the allele-specific replacements via CRISPR would be the gold standard for this. On this basis, the fundamental results and conclusions of the paper appear to be on a solid foundation, providing an appropriate context for the biochemical characterization that forms the second half of the paper.

We thank the reviewer for noticing the comprehensive analyses we provided and are happy to hear that the “fundamental results and conclusions of the paper appear to be on a solid foundation.”

The primary concerns that I have with the work relate to the lead-in to the mapping results, especially around the GWAS results and how they are portrayed. While the paper definitely appears to identify "a gene" for arsenic resistance, we are given very little context for evaluating how important this gene is in the context of genetic variation for arsenic resistance per se. Instead, the mapping results are used to move directly to proposed significance in human health, but we do not really know what the relevance to the worms themselves is in the first place.

We thank the reviewer for these observations. We addressed this criticism by quantifying and more thoroughly reporting the effect sizes of the traits that we discussed. Specifically, we included the following:

1) Effect-size calculations from the dose-response experiment (Figure 1—figure supplement 3; Figure 1—source data 3).

2) Heritability estimates from the dose-response experiment (Figure 1—figure supplement 3; Figure 1—source data 2).

3) Heritability estimates for all traits measured in the RIAIL population (Figure 1—figure supplement 7; Figure 1—source data 12)

4) Heritability estimates for all traits measured in the wild strain populations (Figure 2—figure supplement 2; Figure 2—source data 6)

5) Explicit mentions throughout the text of the effect sizes for all experiments following the mapping experiments, which include the NIL, allele-swap, and C15ISO rescue experiments. In all cases we relate our findings in these experiments to the results originally obtained in the linkage mapping experiment because the strains we used for follow-up experiments were N2 and CB4856, which were the parental strains used to generate the RIAIL population.

This paper clearly approaches this problem from a quantitative genetic perspective, and yet we are provided very little quantitative genetic context in which to evaluate the mapping approach, especially for the GWAS. Specifically, there is no presentation of a partitioning of within and among line variance, which is necessary to understand what kind of expectations one might have in terms of mapping outcomes, especially with respect to power. In the "transparent reporting" document that accompanies the manuscript, it is stated that "Replicates and power were determined by the throughput required for the particular assay or variance explained." First, it is not even clear what this statement means other than waving off the reporting standard being imposed by eLife. More importantly, however, this simply cannot be the case for the GWAS, or at least it cannot be evaluated in the absence of some sort of analysis of the quantitative genetics of the traits involved.

To attempt to address this concern, at a fairly high degree of difficulty, I worked to figure out how the data for this experiment actually worked. While we are at a brave new world of open data, which is a good thing, there could be some debate as to whether the data standard should be to simply make all raw data available or if there should be some effort to curate the data such that an external reader is told what the columns mean (e.g., what is the role of stage in the statistical outcomes), which are critical for evaluation of the results (e.g., temporal blocking), etc.

We sincerely apologize for the perceived failure to report the data formats used throughout the manuscript. To address this issue, we included detailed descriptions of all columns for all Supplementary Files used in the manuscript. This information can be found at the end of the manuscript under the section labelled “Format of Supplementary Files.”

I note that the Materials and methods section on "statistical analysis" outlines the most trivial model possible, so I assume that is the approach used throughout the paper. In the absence of this information, I simply tried to use a GLM analysis of the GWAS population samples to estimate (via ML) the within and among line variances across environmental treatments. I found that the among-strain variance accounted for about 1-2% of the total variance observed among individuals, while the condition-by-strain variance explained about 2-10%. Now this is for tens of thousands of observations, so these estimates are for the most part quite significant.

We thank the reviewer for taking the time to perform statistical analysis on the GWAS data set. To confirm these results, we performed the same analysis using the raw BIOSORT data and reached the same conclusion as the reviewer. Unfortunately, these data are not what is used in our analyses. In the reviewer’s analysis, one would have to consider each observation (animal) from a population of growing animals in one well of a 96-well plate to be an independent replicate. In that well, the animals have experienced the same preparation procedure, the same bacterial food concentration, and the same environment. For this reason, we use summary statistics for each replicate population of animals grown in the same well and not single animal measurements. Our linkage and GWA mapping experiments only used a single replicate population of animals from each RIAIL or wild isolate, respectively. To make our procedure more clear, we included a “Format of Supplementary Files” section at the end of the manuscript that contains detailed information of each Supplementary File. Additionally, we added more detail to the “Calculation of fitness traits for genetic mappings” section to better describe what traits were used for the mapping and follow-up experiments.
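
The distinction between per-animal observations and per-well summary statistics can be sketched in a few lines of Python. This is a minimal illustration only: the record layout, well names, measurement values, and the particular summaries computed are assumptions made for the example, not the actual BIOSORT processing pipeline.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical per-animal records: (strain, well, TOF), where TOF stands in
# for a length-like measurement. All values are invented for illustration.
animals = [
    ("N2", "A01", 310.0), ("N2", "A01", 290.0), ("N2", "A01", 300.0),
    ("CB4856", "A02", 350.0), ("CB4856", "A02", 370.0),
]

def summarize_wells(records):
    """Collapse animals from the same well into one row of summary statistics."""
    by_well = defaultdict(list)
    for strain, well, tof in records:
        by_well[(strain, well)].append(tof)
    return {
        key: {"n": len(values), "mean.TOF": mean(values), "median.TOF": median(values)}
        for key, values in by_well.items()
    }

# One summary row per well: this row, not each animal, is the unit of analysis.
well_summaries = summarize_wells(animals)
```

Under this layout, five animals yield two rows, so a variance partition run on the animals and one run on the well summaries answer different questions.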

However, it is the total among-strain (or strain-by-condition) variance that will determine the power for identifying any particular marker as significant as opposed to the overall heritability. I have little doubt that my quick and dirty look at this is flawed. However, I would expect to see something like this in the paper itself, at least as a supplementary table. Lack of this analysis also does not allow the authors to say anything about effect sizes, which are also notably absent from the paper.

As mentioned above, we have included effect-size calculations (Figure 1—source data 3) and broad-sense heritability estimates (Figure 1—source data 2) from the dose response experiment as supplemental data files. We have also reported effect sizes for the chromosome II QTL, for the NIL to recapitulate the QTL effect and interval, for the allele-swap to define causality of dbt-1, and for the fatty acid supplementation experiments to define more of the mechanism for the effect of arsenic on dbt-1 variation.

It is perhaps then not surprising that no results turned up for the single-trait analysis.

We are sorry that the reviewer missed our summaries and other supplemental data. Our goal was to keep the analyzed traits (brood size [norm.n] and animal length [median.TOF]) consistent throughout all experiments in the manuscript. To address this comment, we included GWA mapping results for all traits with significant QTL (Figure 2—source data 7) and included a summary plot of these results (Figure 2—figure supplement 3). These results show that non-PCA-transformed traits also map to the center of chromosome II and overlap with the QTL identified by linkage mapping. To emphasize this point, it is not just principal component traits that mapped to this QTL; 26 traits from our high-throughput fitness assays mapped to the same position. Our desire to make the traits more understandable led this reviewer to the unfortunate conclusion that the population-wide differences in arsenic responses were not significant.

The authors therefore turn to a PCA approach, which itself features some fairly unusual aspects in terms of presentation. There appear to be five traits measured on each individual, yet the PCA is based on 67 measurements ("traits"). These are apparently largely summary statistics for individuals from each strain.

This assessment is correct. Please refer to our response above for further details regarding our rationale for this approach.

The PCA must therefore have been generated via the among-line statistics in the absence of within-line variation, which is fine.

We want to reiterate that we have no measure of within-line variation for the GWA mapping dataset. As we describe above, we have made the Materials and methods section more clear in this regard.

What seems less fine is that the "traits" used here are actually all derivatives of the original five traits and are mostly descriptions of the within-line distributions. This means that many of the measures are functions of one another (e.g., the mean and CV). Indeed, I am surprised that the PCA works at all, as the correlation matrix for this set of traits should be singular.

It is true that the traits we used for PCA are linearly codependent, and the determinant of the trait correlation matrix is essentially zero (1.21e-126). However, because PCA is not iterative, in that only one eigen-decomposition step is performed, it does not require independent variables (traits) (e.g. a singular trait correlation matrix results in eigenvalues that approach zero). We therefore used PCA to eliminate the collinearity of the summary statistic traits. As expected from the collinearity of our traits, the PC loadings for the first principal component are very similar to each other. To further supplement this analysis, we included supplementary figures and files of the trait correlations and PC loadings for all experiments.
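
The claim that a singular correlation matrix still decomposes, with some eigenvalues approaching zero, can be illustrated with the smallest possible case. The sketch below is a toy, not the 64-trait analysis: it assumes one hypothetical trait and an exact linear function of it, and uses the closed-form eigenvalues of a 2x2 correlation matrix.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def eigenvalues_2x2_correlation(r):
    """The eigenvalues of [[1, r], [r, 1]] are 1 + r and 1 - r."""
    return 1.0 + r, 1.0 - r

# Hypothetical trait values and an exact linear function of them
trait = [1.0, 2.0, 4.0, 7.0, 11.0]
rescaled = [2.0 * t + 1.0 for t in trait]

r = pearson(trait, rescaled)
lam1, lam2 = eigenvalues_2x2_correlation(r)
determinant = lam1 * lam2          # product of eigenvalues; zero for a singular matrix
pc1_share = lam1 / (lam1 + lam2)   # fraction of total variance carried by PC1
```

Here the determinant is zero, the second eigenvalue vanishes, and the first component carries all of the variance. The decomposition itself is well defined; whether the resulting components are stable under sampling variance is a separate question that this toy does not address.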

As importantly, the new PCA "traits" that emerge from such an analysis essentially have no biological meaning. They are almost entirely statistical constructions. I cannot think of any valid reason to do this, unless variance per se is the trait of interest, in which case that should be directly addressed in a much different fashion.

Respectfully, we disagree with this assessment. The “first principal component trait” (PC1) is a linear combination of the input traits. In the original manuscript, we showed that PC1 is highly correlated with traits associated with animal length (Figure 1—figure supplement 4-5), which is a proxy of developmental progression in our fitness assay. In an attempt to further assign biological meaning to the PC1 trait, we included PC1 vs animal length or brood size correlation plots and loadings for all experiments (Figure 1—figure supplements 2,4,10, and 11; Figure 2—figure supplement 1; Figure 3—figure supplement 2; Figure 4—figure supplement 6-7) in the revised manuscript.

Unless motivated quite differently, the PCA should be conducted on just the original five traits.

We have included additional justification for using PCA throughout the manuscript. Furthermore, we relate the first principal component trait to phenotypes measured using the high-throughput fitness assay.

It is also odd that the interpretation for the PCA results is then motivated by a regression of PCA1 on body length (Supplementary Figure 4). That is exactly what the PCA loading (eigenvector coefficient) provides in the PCA analysis.

We agree with the reviewer that principal component loadings describe the same thing as what was shown in Supplemental Figure 4 of the original manuscript. However, our goal was to provide general audience eLife readers with a more intuitive representation of the correlation between principal components and animal length or brood size traits. Because not every reader will be as keenly aware of the intricacies of PCA as the reviewer, we think that this visualization is helpful. As stated above, we report the PC loadings for all experiments in the updated manuscript.

If the PCA is to be conducted, then it should be done so in the context of a full analysis, which would include reporting the factor loadings.

We have included all of the requested files (Figure 1—source data 4-6, 10, and 15; Figure 2—source data 2, Figure 4—source data 7) and corresponding figures (Figure 1B—figure supplements 2,4,10, and 11; Figure 2—figure supplement 1; Figure 3—figure supplement 2; Figure 4—figure supplement 6-7).

It seems that nearly all of these features simply measure how big something is (e.g., color brightness) and that including each of these measures into a single composite parameter allows the GWAS to cross a variance threshold that provides the single significant point. The PCA itself does not completely make sense from a pure statistical point of view, and so it appears that this mapping result is the primary motivation for the approach.

We hope that the reviewer understands the approach better after the clarifications above and our edits to the manuscript.

A final mapping oddity is that the authors report that they use full DNA strain-specific sequencing results to calculate the relatedness matrix, but only use much older RAD marker data to do the actual mapping.

We appreciate this criticism. The use of the RAD marker data since its original publication in 2012 has been to keep markers consistent among different C. elegans GWA mapping experiments. We agree that this approach makes less sense with genome-wide sequence data available. We found (though we did not discuss in the original manuscript) that these markers are sufficient to “tag” all strain haplotypes present in 97 wild isolates made available by Andersen et al., 2012. The 86 strains that we phenotyped in the presence of arsenic trioxide were drawn from this same 97 strain population. We reasoned that the same marker set would be sufficient for GWA mapping. However, we updated our GWA mapping approach to alleviate any possible concerns the reviewer might have about our choice of markers. Here is a brief summary that we detail in the Materials and methods section:

We started with the imputed VCF that can be found at https://elegansvariation.org/data/release/latest. We removed variants with minor allele frequency less than 5% and then removed highly correlated markers (r^2 > 0.8). Finally, we kept sites where all strain genotypes were present. This pruned marker set contains 59,241 markers. We used this marker set to construct the relatedness matrix and perform GWA mapping on the 86 phenotyped strains. We have included the resulting marker set in Figure 2—source data 5.
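
The three filters described above can be sketched as follows. This is a simplified illustration under stated assumptions: genotypes are coded 0/1 per strain with `None` for a missing call, the correlation pruning is a greedy pass over all previously kept markers rather than a windowed procedure, and the completeness filter is applied first for convenience. The function names and toy genotypes are hypothetical, and the tool actually used for pruning is not specified here.

```python
def minor_allele_frequency(calls):
    """Frequency of the rarer allele among 0/1 calls (strains treated as homozygous)."""
    alt = sum(calls) / len(calls)
    return min(alt, 1.0 - alt)

def r_squared(x, y):
    """Squared Pearson correlation between two markers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def prune_markers(genotypes, maf_min=0.05, r2_max=0.8):
    """genotypes: dict of marker name -> list of 0/1/None calls, one per strain."""
    kept = []
    for name, calls in genotypes.items():
        if any(c is None for c in calls):
            continue  # keep only sites genotyped in every strain
        if minor_allele_frequency(calls) < maf_min:
            continue  # drop rare (and monomorphic) variants
        # greedy pruning: drop a marker highly correlated with one already kept
        if any(r_squared(calls, genotypes[k]) > r2_max for k in kept):
            continue
        kept.append(name)
    return kept

# Toy data (hypothetical): six strains, five markers
toy_genotypes = {
    "m1": [0, 0, 1, 1, 0, 1],
    "m2": [0, 0, 1, 1, 0, 1],     # identical to m1, so pruned
    "m3": [0, 1, 0, 1, 0, 1],
    "m4": [0, 0, 0, 0, 0, 0],     # monomorphic, fails the frequency filter
    "m5": [0, 1, None, 1, 0, 0],  # missing call, fails the completeness filter
}
pruned = prune_markers(toy_genotypes)
```

In this toy set only `m1` and `m3` survive, which mirrors how a pruned marker set tags haplotypes with far fewer markers than the full variant list.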

The only motivation that I can see for this is to reduce the number of comparisons so that the significance threshold does not become too high. Whatever the motivation, it is not specified by the authors, nor are potential power issues addressed in the paper itself or in the "transparent reporting" document.

We are sorry that the reviewer evaluated our analysis and came to this conclusion. As we described, a large number of traits map to the same QTL on chromosome II above a Bonferroni-corrected significance threshold. The updated manuscript has several changes. To improve our mapping techniques, we included genome-wide markers and then used two different methods for setting significance thresholds. The first is the Bonferroni-corrected (BF) threshold as we already described. The BF threshold for the updated marker set is ~8.4e-07, corresponding to a -log10(p) of ~6. In the second method for multiple-testing correction, we performed spectral decomposition of the genotype correlation matrix from the genome-wide marker data set as performed previously. We set all eigenvalues greater than one to one and then summed all of the eigenvalues to identify the number of independent tests in the marker set. This analysis revealed that there are 500.761 independent tests in our marker set of 59,241 markers, corresponding to a significance threshold of 1e-04 or a -log10(p) of 4. We have included a detailed explanation of this approach in the Materials and methods section.
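
The two thresholds can be reproduced with a short calculation. The sketch below assumes a family-wise error rate of 0.05, which is not stated explicitly above but is consistent with the reported values; the eigenvalue list is a hypothetical toy, since the real eigenvalues come from the genotype correlation matrix.

```python
import math

def effective_number_of_tests(eigenvalues):
    """Cap each eigenvalue of the genotype correlation matrix at one, then sum."""
    return sum(min(ev, 1.0) for ev in eigenvalues)

def threshold(alpha, n_tests):
    """Per-test p-value cutoff and its -log10 value."""
    p = alpha / n_tests
    return p, -math.log10(p)

ALPHA = 0.05  # assumed family-wise error rate

bf_p, bf_log = threshold(ALPHA, 59241)      # Bonferroni over all pruned markers
eig_p, eig_log = threshold(ALPHA, 500.761)  # independent tests reported above

# Toy eigenvalues (hypothetical): two large components and three small ones
toy_m_eff = effective_number_of_tests([3.2, 1.5, 0.2, 0.07, 0.03])
```

With these inputs the Bonferroni cutoff is about 8.4e-07 and the decomposition cutoff is about 1e-04, matching the figures quoted above. The capping step is what makes correlated markers count as fewer than one test each.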

Taken together, the GWAS approach appears to have the hallmarks of "p-hacking." In other words, the primary goal of the analysis is to ensure that there is at least one red dot on Figure 2, such that it supports the conclusions of the QTL mapping.

We are convinced that the main drivers of the reviewer reaching this conclusion are (1) a misunderstanding of the input data structure and (2) marker set choice. We have addressed both of these points above. However, the reviewer’s suspicion motivated us to update our mapping approach, and we are thankful for this input.

With our updated GWA mapping pipeline, 26 of the 64 traits that we used as input for PCA map to the center of chromosome II. As for the PC1 trait, the p-value for the peak marker of the chromosome II QTL is ~1.48e-07, which is nearly an order of magnitude lower (more significant) than the Bonferroni-corrected significance threshold and nearly three orders of magnitude lower than the spectral decomposition threshold described above. If we were to take multiple testing correction to the extreme and set the number of tests to be 64 traits tested * 60,000 markers, we end up with ~4 million tests (BF threshold: ~1.3e-8) and 33,000 tests (decomposition threshold: ~1.5e-6). While the PC1 QTL peak p-value does not quite pass the BF significance threshold in this extreme case, it does pass the decomposition threshold by an order of magnitude. We note that some of the other traits do pass this extreme threshold (mean.yellow: p-value at peak marker ~2.6e-10). However, as the reviewer pointed out, we do not have 64 independent traits because they are all summary statistics of five traits quantified by the BIOSORT. This observation is supported by our observation that four principal components capture >90% of the total variance in the GWAS phenotype data set. If we were to perform multiple testing correction across these four traits, which we note are still not fully independent, the BF threshold is ~2.1e-07, which is higher than the p-value for the chromosome II QTL peak marker obtained from mapping the PC1 trait, making our mappings significant.
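
The comparisons in this paragraph are easier to follow when laid out side by side. The following sketch recomputes each cutoff from the test counts quoted above and checks the PC1 peak p-value against it. It assumes a family-wise error rate of 0.05, which is consistent with the quoted thresholds but not stated explicitly; recall that a smaller p-value is more significant, so a marker passes when its p-value falls below the cutoff.

```python
ALPHA = 0.05           # assumed family-wise error rate
PC1_PEAK_P = 1.48e-07  # peak-marker p-value for PC1, as quoted above

# Number of tests under each correction scheme, using the counts quoted above
n_tests = {
    "BF, one trait": 59241,
    "decomposition, one trait": 500.761,
    "BF, 64 traits": 64 * 60000,
    "decomposition, 64 traits": 33000,
    "BF, 4 PCs": 4 * 59241,
}

results = {}
for label, n in n_tests.items():
    cutoff = ALPHA / n
    results[label] = {
        "cutoff": cutoff,
        "passes": PC1_PEAK_P < cutoff,
        "fold": cutoff / PC1_PEAK_P,  # how many times the cutoff exceeds the p-value
    }

for label, row in results.items():
    print(f"{label}: cutoff={row['cutoff']:.2e} passes={row['passes']} fold={row['fold']:.1f}")
```

Only the most conservative scheme, Bonferroni correction across all 64 traits and all markers, places the PC1 peak above the cutoff.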

I am not saying that this is in fact what the authors have done; simply that one gets that impression because of the statistical gymnastics that it seems to take to get here, coupled with a realistic assessment of what is actually possible to map for these traits in these populations.

Again, we are sorry that the reviewer reached this conclusion. We hope our explanations and re-analyses are sufficiently convincing.

This view is further supported by the singular and assertive nature of the narrative, in which limitations and alternative explanations are not entertained. I fully appreciate that this is the nature of the style of presentation in top tier journals such as eLife, in which a clear narrative coupled to a specific mechanism seems to be a requirement for publication. However, eLife's new data standards seem to be at odds with this cultural tradition. They need to figure out how they actually want to have this work. I do feel for the authors that they may be caught between an editorial expectation for clarity and the views of this specific reviewer. And I may be entirely wrong in this, which makes this specific form of publication cycle more awkward than a traditional feedback and response pattern around concerns such as this.

Our main goal was to describe the results of our analyses that clearly show variation in the dbt-1 locus contributes to differential arsenic toxicity in the C. elegans population. We have provided all raw and processed data, included a description of data processing and mapping procedures, and expanded our reported traits to emphasize that this QTL is highly significant. The reviewer’s comments have improved the overall clarity of the manuscript and results. We are grateful for the input.

Similar concerns exist in the QTL mapping experiment, although the results appear to be clearer here. By my calculations the among strain variance for the RIAILs tends to be around 1-2% and the strain-by-condition variance tends to be 2-8%.

The phenotyping of the RIAILs used for linkage mapping was performed in the same manner as we described above for the wild isolates used for GWA mapping. Specifically, we did not quantify replicates of the RIAIL phenotypes; instead, we relied on the replication of RIAIL genotypes being derived from two parental strains. This precludes proper analysis of variance for this particular data set, but we are able to estimate variance components based on genetic relatedness across the genome, as we describe below.

Analysis of the genetic variance here is important to the interpretation of the results, especially from the broader worm natural variation point of view, which I recognize is not a point of emphasis in the majority of the paper. Nevertheless, estimates of QTL effect sizes will help the reader interpret both these results and the interpretation of the population genetic and other functional explanations later in the paper.

It is true that our main emphasis was not focused on a more quantitative genetic analysis, including genetic variance in our dataset. However, to address the reviewer's concerns thoroughly, we added the following analyses:

1) We have provided effect size estimates for the dose response experiment for all measured and principal component traits. We report different measures of effect size including η2, partial η2, ω2, partial ω2, and Cohen’s F. Additionally, we included broad-sense heritability estimates from this experiment. – Supplementary Files 2-3 and Figure 1—figure supplement 3.

2) We have included genomic heritability estimates of all BIOSORT and principal component traits used for linkage mapping. We provide two estimates of narrow- and broad-sense heritability using a linear mixed-effect model with the equation y=Xβ+Zu+ϵ. The estimates differ in their formulation of the strain additive relatedness matrix: the realized relatedness matrix corrects for allele frequencies, and the expectation matrix does not. For both estimates, the Hadamard product of the additive relatedness matrix was used to calculate the epistatic relatedness matrix (Figure 1—figure supplement 7; Figure 1—source data 12).

3) We have provided effect size estimates for the QTL identified by linkage mapping (Figure 1—source data 11), including discussion of how much of the total additive variance is explained by the identified QTL.

4) We have included presentation of the QTL effect size identified by GWAS in the manuscript, as well as discussion of variance partitioning (Figure 2—figure supplement 2; Figure 2—source data 6).

5) We also present how much of the parental phenotypic difference can be explained by the NILs and genome-edited allele-replacement strains.
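
The relatedness matrices mentioned in item 2 can be sketched as follows. This is an illustration under loud assumptions: genotypes are coded -1/+1 for the two parental alleles, the "realized" form is taken to mean centering each marker by its mean before forming cross-products, and the "expectation" form is taken to mean using the raw codes. These are one plausible reading of the two formulations, not a reproduction of the software used. The element-wise (Hadamard) product of the additive matrix with itself is the standard way to obtain an additive-by-additive epistatic relatedness matrix.

```python
def relatedness(genotypes, center=True):
    """genotypes: list of strains, each a list of marker calls coded -1/+1.

    center=True subtracts each marker's mean (an allele-frequency correction);
    center=False uses the raw codes (an expectation-style matrix).
    """
    n, m = len(genotypes), len(genotypes[0])
    means = [sum(g[j] for g in genotypes) / n if center else 0.0 for j in range(m)]
    w = [[g[j] - means[j] for j in range(m)] for g in genotypes]
    return [[sum(w[i][k] * w[j][k] for k in range(m)) / m for j in range(n)]
            for i in range(n)]

def hadamard(a, b):
    """Element-wise product of two equal-sized matrices."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

# Hypothetical RIAIL-like genotypes: 3 strains x 4 markers, -1 = N2, +1 = CB4856
geno = [
    [-1, -1, 1, 1],
    [-1, 1, 1, -1],
    [1, 1, -1, -1],
]
additive = relatedness(geno, center=False)
epistatic = hadamard(additive, additive)
```

In this toy example the first and third strains carry opposite alleles at every marker, so their additive relatedness is -1 while their epistatic relatedness is +1, which shows why the two matrices can capture different variance components in the mixed model.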

The gem of the work is the allele-specific transformation that clearly demonstrates that DBT-1 has a significant effect on arsenic resistance. I do not know why these transformations were done only on the parental lines and not the NILs, although this work can be technically demanding. The latter transformations would be just a little more convincing in proving that this is actually the causal locus from the mapping experiments and not simply "a locus" that influences arsenic resistance, since there is plenty of residual difference among the parents in the transformants. However, the totality of the evidence definitely supports this interpretation, and this work is well beyond the standard in the field on this account.

We agree with the reviewer that the allele-specific replacements clearly demonstrate that DBT-1 variation has a significant effect on arsenic resistance. We did not find it necessary to introduce the allele-replacements in the NIL genetic background because these genome-edited strains completely recapitulated the NIL effect. However, we realize that, because we did not include more discussion of effect size recapitulation by the NIL and allele-swap strains, it may not be immediately obvious that the DBT-1(C78S) allele completely recapitulated the QTL effect identified via linkage mapping. Specifically, the NILs and allele-swap strains account for ~64-92% of the parental difference in the presence of arsenic trioxide. The chromosome II QTL, which explains ~33% of the phenotypic variation in the RIAIL population, accounts for ~61.7% of the total phenotypic variance explained by genetic factors (H2 ~0.53 for PC1). The discrepancy between the mapping and follow-up experiments is within error of the heritability estimate. We included discussion of these points in the revised manuscript.

I do not understand why there are no error bars on the selection experiments reported in Figure 5. Nor are there any statistical tests associated with these results that I can see. I found the raw data for these results impossible to interpret (Data File 23). Something quite important appears to be missing here, making this section impossible to interpret from a rigor and reproducibility point of view, much less in terms of scientific interpretation. I must be missing something here, so I apologize if it should be more obvious to me.

We updated this figure to include error bars, and we calculated significance associated with differences in read counts across all replicates using Fisher’s exact test (Figure 5—source data 2). We also updated the description of the data file corresponding to these read data to make it more easily interpretable for future readers.
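
For readers unfamiliar with the test, a two-sided Fisher's exact test on a 2x2 table of counts can be written with the standard library alone. This is a generic sketch: the table layout (alleles by conditions) and the counts are hypothetical, and how read counts from multiple replicates were arranged into tables is not described here.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def weight(k):
        # number of tables with k in the top-left cell, all margins fixed
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # sum over every table that is no more probable than the observed one
    extreme = sum(weight(k) for k in range(low, high + 1) if weight(k) <= observed)
    return extreme / total

# Hypothetical counts: rows = two alleles, columns = two conditions
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

Working with integer table counts until the final division avoids floating-point ties when deciding which tables are at least as extreme as the observed one.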

While I appreciate the desire to posit an adaptive significance on the identified alleles, we know that the central regions of C. elegans chromosomes (like the center of chromosome 2) show clear signs of global selective sweeps.

It is our understanding that chromosomes I, IV, V, and X have undergone selective sweeps, not chromosome II. Nevertheless, we explicitly state that the genomic region surrounding dbt-1 has no signature of selection as indicated by Tajima’s D, and we did not explore this possibility further.

It seems just as likely, if not more so, that this allele is simply along for the ride in a linked haplotype and does not in fact have any functional significance in natural populations. It is all well and good to suggest that more work is needed to explore the environmental circumstances that might generate selection, but it is somewhat disingenuous in this species to not discuss the potential for neutral processes in structuring this variation.

We have added to the Discussion section to address the possibility of no adaptive advantage associated with this allele.

As noted above, in general, caveats and alternative explanations are not a feature of this work, which tells a singular story from a very particular point of view. This does not make that view incorrect, but it does not seem to me to be the most rigorous way to present what is a truly impressive amount of work on a very difficult question.

We added alternative explanations throughout the manuscript as described in our responses to the previous comments from this reviewer. We hope that these additions have shown the breadth of our analyses that led to an interesting discovery.

Reviewer #2:

[…] There are one major and one minor issue that should be addressed before publication.

1) Major: The cysteine variant is associated with increased arsenic sensitivity in C. elegans, but with increased cell proliferation in 293 cells treated with arsenic. The authors suggest that the increased cell proliferation is due to the induction of BCAA levels. However, this is not supported by the data using 293 cells. In C. elegans, the cysteine variant is associated with the induction of BCAA levels in the CB4856 background (Figure 4B). In the N2 background, the induction trend is observed but is not significantly different (Figure 4B). But the N2-DBT-1(S78) without arsenic has only two data sets, which might reduce the power of the statistical analysis.

We thank the reviewer for this observation. We thought that, in the context of the CB4856 and CB4856 allele swap result, the conclusion was solid as is. However, we performed the L1 larval stage metabolite profiling experiment at higher replication to address the reviewer’s point. For this experiment, we only included the N2 and N2 allele swap strains because these are the relevant strains to make the conclusion that there is a DBT-1 allele-specific effect on branched-chain fatty acid production. We acquired an additional six independent paired 100 µM arsenic trioxide-control replicates to strengthen our conclusions. The results of this experiment are described in the “Arsenic trioxide inhibits the DBT-1 C78 allele” section of the manuscript and shown in Figure 4B and Figure 4—figure supplements 2-4. The results from this experiment support the conclusion that strains with the DBT-1(C78) have higher branched/straight-chain ratios relative to strains with the DBT-1(S78) for the C17 (CB4856 DBT-1(C78): Tukey HSD p-value = 0.164721, n=3; N2 DBT-1(C78): Tukey HSD p-value = 0.003747, n=6) and C15 ratios (CB4856: Tukey HSD p-value = 0.0427733, n=3; N2: Tukey HSD p-value = 0.0358, n=6). We note that the C17ISO/C17SC ratio is not significantly different when comparing the CB4856 and CB4856 allele swap strain, but the direction of effect matches our other observations. Furthermore, we show that there are DBT-1 allele-specific differences in C15ISO and C17ISO for all strains in control conditions (CB4856-C15ISO DBT-1(C78): Tukey HSD p-value = 0.0036201, n=3; N2-C15ISO DBT-1(C78): Tukey HSD p-value = 0.0265059, n=6; CB4856-C17ISO DBT-1(C78): Tukey HSD p-value = 0.0086572, n=3; N2-C17ISO DBT-1(C78): Tukey HSD p-value = 0.0022501, n=6). Importantly, the DBT-1 allele-specific differences in the fatty acid ratio and ISO measurements were not driven by differences in straight-chain fatty acids.

In addition to increasing the replication of the L1 assay, we quantified the effects of the DBT-1(C78S) allele in young adult animals to see if the conclusions held true across different developmental stages. For this experiment, we included six independent replicates for the N2, CB4856, and allele-swap strains. In contrast to the results at the L1 developmental stage, we did not observe the same effect of the DBT-1(C78S) allele at the young adult life stage. The results of this experiment are shown in Figure 4—figure supplement 5.

Taken together, these results suggest that the DBT-1(C78) allele produces more branched chain fatty acids than the DBT-1(S78) allele, but this effect is dependent on the developmental stage of the animals.

2) Minor: To avoid unnecessary confusion, the authors might consider changing sample labeling in Figure 4C to be consistent with other figure panels. N2 DBT-1 (S78) instead of N2 DBT-1 (C78S), CB4856 DBT-1 (C78) instead of CB4856 DBT-1 (S78C).

We updated Figure 4C to make the genotypes clearer.

[Editors' note: the evaluation of the revised submission follows.]

Reviewer 1 in particular remains unconvinced regarding the justification for the PCA. Reviewer 2 notes that it has not been demonstrated that the increased cell proliferation in HEK293 cells is due to increased BCAA levels. More details are in the reviews below.

Reviewer #1:

[…] With that preamble, I appreciate that the authors have clarified a great deal of the analysis, particularly the use of line means in the quantitative genetic analysis. The revision remains as the initial submission: a comprehensive analysis of natural variation in arsenic sensitivity. The overall conclusions are still well supported. Nevertheless, I find that the multivariate approach for the mapping analysis is not well justified. This need not be blown out of proportion, but must still be pointed out.

1) The authors have now greatly clarified the conversion of the raw observations into the mapping approach. This was a bit opaque before, but it is now clear that this is a line-means analysis, which is perfectly appropriate for mapping, and the newly reported results show that there is substantial among-line variation (H2 for their traits). My original point relative to inferences within natural populations remains, however. These line-based differences actually explain very little about the susceptibility of individual worms to arsenic, which is largely due to random effects. The experimental design concerns discussed here mean that the values that I calculated are actually an upper bound on the actual broad sense heritability in the population. This may or may not be important for the problem at hand, but it is important for the interpretation of these results from the standpoint of worm populations.

Thank you for your comment about the revision. We added substantial explanations, additional data, and a data dictionary to explain our approach. It is a line-means analysis. Because C. elegans is a selfing hermaphrodite with little diversity at the local level, most populations are nearly clonal. Our approach does not address the influences on individual animals but does address what a nearly genetically identical population would encounter in the wild. We added some details to the Results about how this analysis might not be applied easily to natural populations at the individual level.

2) In the first version of the paper, the authors reported that they did not identify any significant peaks for the single trait analysis so they performed the mapping on the PCA. This is what was meant by "no results turned up for the single-trait analysis". The reviewer did not miss the summaries and supplementary materials. It did not mean that no results were presented, which is something quite different. The current manuscript has been modified to simply gloss over these issues and go directly to the PCA. While the authors "respectfully disagree" with criticisms of the PCA approach, they do not address the issues. The approach is statistically, mathematically, and biologically invalid. There is simply no justification for throwing summary statistics that are functions of one another into a PCA except to try to minimize variance. Showing that an arbitrary and invalid PCA is correlated with a biological measure begs the question as to why one would not simply use the measure itself. Yes, it is possible to perform a PCA on a singular covariance matrix, but the outcomes tend to be highly unstable in the face of sampling variance. It is the eigenvectors that are the elements that are subject to interpretation. It is still the case that nothing about this analysis makes sense except that it yields a result that is consistent with the other, more convincing parts of the paper. It still feels that the main goal here was to derive a trait that yielded a significant result. The non-significant pileups of the other traits could have been subjected to a joint probability test to achieve a similar goal in a much more straightforward and interpretable fashion.
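The reviewer does not name a specific joint probability test. One common choice is Fisher's combined probability test, sketched below under that assumption. The method assumes the combined p-values are independent, which would not hold for summary statistics that are functions of one another, so the sketch illustrates the calculation only. The example p-values are hypothetical.

```python
from math import exp, log


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of
    freedom the upper tail has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    k = len(pvalues)
    if k == 0 or any(not 0 < p <= 1 for p in pvalues):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")
    half_statistic = -sum(log(p) for p in pvalues)  # equals X / 2
    term = 1.0
    series = 1.0
    for i in range(1, k):
        term *= half_statistic / i
        series += term
    return exp(-half_statistic) * series


# Hypothetical per-trait p-values at one marker, none significant alone.
combined = fisher_combined_pvalue([0.1, 0.1])
print(f"combined p-value: {combined:.6f}")
```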

We are sorry about this issue. As the reviewer points out, the PCA approach was used in the first version of the paper to reduce the noise from the 67 individual trait measures. As we stated before, the approach was not used to yield a result “consistent with the other, more convincing parts of the paper”. The original genome-wide association mappings in the first version of the manuscript used the RAD-sequencing marker set and a Bonferroni-corrected p-value threshold on only two traits (brood size and median animal length). For these two traits, we did not map a QTL to the center of chromosome II, which we hypothesized was caused by noisy traits measured from single replicates of a small number of wild strains. However, other traits did map to this chromosome II region as we presented in the resubmission. For this reason, we used the PCA approach to “clean” up the correlated traits and performed the mappings to find a QTL on chromosome II. It was not performed after the linkage mappings, NIL experiments, and allele-replacement tests. As we described previously, the overlap of these two mapping approaches is what inspired us to perform the follow-up experiments. In the revision (and thanks to the reviewer’s comments), we updated our mapping algorithm, the marker data set, added a new significance threshold, and included the mapping results of all of the traits used as inputs to PCA. These changes enabled significant GWA mappings (Bonferroni correction) for 26 of 64 of the summary BIOSORT-measured traits and the major PC to the same position on the center of chromosome II (Author response image 1). All trait mapping results from the original revision can be found on FigShare (https://doi.org/10.6084/m9.figshare.7458932.v2). In our revision, we attempted to justify the use of PCA again as a data cleaning method to show that the correlated summary statistics were loaded into the first PC. 
After the reviewer’s latest comments and discussions with mathematicians, we understand the limitations of using PCA to “clean” up correlated summary statistics as opposed to looking for relationships among seemingly unrelated traits. We would like to point out again that, in our original revision, 26 of 64 statistical summary traits, which relate directly to the biology of C. elegans (e.g. the 90th quantile of animal length), map significantly (Author response image 1) to the same position on chromosome II. To follow the reviewer’s suggestion and the editor’s request for a more appropriate PCA approach, we only used four traits (brood size: norm.n; animal length: mean.TOF; optical density: mean.norm.EXT; and fluorescence: mean.norm.yellow), which are each measured independently from each other (i.e. different lasers/PMTs or counts on the COPAS BIOSORT), as inputs for PCA. As mentioned above, these four traits are correlated, so it is clear that PC1 is capturing overall arsenic-induced toxicity, and this PC mapped to the center of chromosome II. We use this PC trait across all figures to keep consistency across all analyses. It is important to note that this PC is also highly correlated with biologically meaningful traits like population animal length and optical density. The overlap of association and linkage mapping experiments led us to the interval tested using NILs and containing the allele-replacement validated gene dbt-1. Of the independent traits that we used as inputs for the PCA, only the brood size trait alone did not map to the center of chromosome II. We also include the analysis and results of these four independently quantified traits for all follow-up experiments. We would like to emphasize that this new analysis does not change any of the conclusions of the manuscript, but we hope these final changes address the reviewer’s remaining concerns regarding the PCA approach.
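The idea that a single first principal component summarizes four correlated traits can be illustrated with a minimal sketch. The trait names match those listed above, but the data are synthetic (a made-up shared "toxicity" signal plus noise), and the power-iteration routine is a generic stand-in rather than the authors' pipeline. It shows that, when all four traits are positively correlated, PC1 loads on all of them with the same sign and captures most of the standardized variance.

```python
import random
from statistics import mean, pstdev


def standardize(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    center, spread = mean(values), pstdev(values)
    return [(v - center) / spread for v in values]


def correlation_matrix(columns):
    """Pearson correlation matrix of equal-length trait columns."""
    z = [standardize(col) for col in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def first_principal_component(corr, iterations=500):
    """Dominant eigenvector and eigenvalue of corr via power iteration."""
    k = len(corr)
    vector = [1.0 / k ** 0.5] * k
    for _ in range(iterations):
        product = [sum(corr[i][j] * vector[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in product) ** 0.5
        vector = [x / norm for x in product]
    eigenvalue = sum(
        vector[i] * sum(corr[i][j] * vector[j] for j in range(k)) for i in range(k)
    )
    return vector, eigenvalue


# Synthetic data: one hypothetical shared response per strain plus trait noise.
random.seed(0)
shared = [random.gauss(0, 1) for _ in range(200)]
trait_names = ["norm.n", "mean.TOF", "mean.norm.EXT", "mean.norm.yellow"]
traits = {name: [s + random.gauss(0, 0.5) for s in shared] for name in trait_names}

corr = correlation_matrix([traits[name] for name in trait_names])
loadings, eigenvalue = first_principal_component(corr)
explained = eigenvalue / len(trait_names)  # trace of a correlation matrix is k

for name, loading in zip(trait_names, loadings):
    print(f"{name}: PC1 loading {loading:.2f}")
print(f"PC1 fraction of variance: {explained:.2f}")
```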

Author response image 1
GWA mapping QTL summary.

All QTL identified by GWA mapping are shown. Traits are labeled on the y-axis, and the genomic position in Mb is plotted on the x-axis. Triangles represent the peak QTL position, and bars represent the associated QTL region of interest. Triangles and bars are colored based on the significance value, where red colors correspond to higher significance values.

https://doi.org/10.7554/eLife.40260.076

3) The genomic approach and markers used in the GWAS is now much clearer.

Thank you. We are much happier with the new GWAS approach, which was motivated by the reviewer’s initial concerns. As we explained in the manuscript and above, the majority of traits, including the first principal component trait, had at least one significant QTL on chromosome II overlapping dbt-1. These results show that natural variation in arsenic responses across C. elegans wild strains maps by linkage mapping to the center of chromosome II (shown here as Author response image 2).

Author response image 2
Linkage mapping QTL summary.

All QTL identified by linkage mapping are shown. Traits are labeled on the y-axis, and the genomic position in Mb is plotted on the x-axis. Triangles represent the peak QTL position, and bars represent the associated 1.5-LOD drop QTL confidence interval. Triangles and bars are colored based on the LOD score, where red colors correspond to higher LOD values.

https://doi.org/10.7554/eLife.40260.077
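The 1.5-LOD drop confidence interval mentioned in the caption above can be sketched as follows. This is a simplified version that keeps the contiguous run of markers around the peak whose LOD score is within 1.5 of the maximum; established QTL software typically extends the interval outward to the next flanking marker. The marker positions and LOD scores are hypothetical.

```python
def lod_drop_interval(positions, lods, drop=1.5):
    """Return (left, peak, right) positions of a LOD-drop interval.

    Starting at the marker with the highest LOD score, extend in both
    directions while neighboring markers stay at or above peak - drop.
    """
    if len(positions) != len(lods) or not lods:
        raise ValueError("positions and lods must be non-empty and equal length")
    peak = max(range(len(lods)), key=lods.__getitem__)
    threshold = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= threshold:
        right += 1
    return positions[left], positions[peak], positions[right]


# Hypothetical markers (Mb) and LOD scores along one chromosome.
positions_mb = [6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0]
lod_scores = [2.1, 4.0, 7.2, 8.0, 7.0, 5.9, 3.0]

interval = lod_drop_interval(positions_mb, lod_scores)
print(interval)
```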

4) The selection experiments in the human cell lines now contain error bars and significance tests, which is obviously a good thing. These effects are obviously not particularly large, and selection among human cells seems like it could be subject to a number of complicating interpretations, but the result is well consistent with the overall results and interpretation.

Thank you for these comments.

Reviewer #2:

In the revised manuscript, the authors have addressed my concern on BCAA changes in C. elegans with more biological replicates and stronger statistical power. However, my comments on BCAA changes in 293 cells have not been addressed. To demonstrate the evolutionary conservation, the authors showed that the cysteine variant is associated with increased cell proliferation in 293 cells treated with arsenic. They suggest that this increased cell proliferation is due to the induction of BCAA levels. However this is not supported by the data using 293 cells.

This error was ours and should have been avoided. During the editing process between different software packages, the offending sentence was not removed in the submitted revision. We have removed this sentence and also tightened the arguments in human cells (final section of the Results). We hope that these changes have alleviated this concern.

https://doi.org/10.7554/eLife.40260.081

Article and author information

Author details

  1. Stefan Zdraljevic

    1. Interdisciplinary Biological Sciences Program, Northwestern University, Evanston, United States
    2. Department of Molecular Biosciences, Northwestern University, Evanston, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-2883-4616
  2. Bennett William Fox

    Boyce Thompson Institute and Department of Chemistry and Chemical Biology, Cornell University, Ithaca, United States
    Contribution
    Data curation, Formal analysis, Investigation, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  3. Christine Strand

    Broad Institute of MIT and Harvard, Cambridge, United States
    Contribution
    Data curation, Formal analysis, Validation, Investigation, Writing—review and editing
    Competing interests
    No competing interests declared
  4. Oishika Panda

    1. Boyce Thompson Institute and Department of Chemistry and Chemical Biology, Cornell University, Ithaca, United States
    2. The Buck Institute for Research on Aging, Novato, United States
    Contribution
    Data curation, Formal analysis, Investigation
    Competing interests
    No competing interests declared
  5. Francisco J Tenjo

    Boyce Thompson Institute and Department of Chemistry and Chemical Biology, Cornell University, Ithaca, United States
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  6. Shannon C Brady

    1. Interdisciplinary Biological Sciences Program, Northwestern University, Evanston, United States
    2. Department of Molecular Biosciences, Northwestern University, Evanston, United States
    Contribution
    Resources, Writing—review and editing
    Competing interests
    No competing interests declared
  7. Tim A Crombie

    Department of Molecular Biosciences, Northwestern University, Evanston, United States
    Contribution
    Investigation, Writing—review and editing
    Competing interests
    No competing interests declared
  8. John G Doench

    Broad Institute of MIT and Harvard, Cambridge, United States
    Contribution
    Resources, Supervision, Methodology, Project administration, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-3707-9889
  9. Frank C Schroeder

    Boyce Thompson Institute and Department of Chemistry and Chemical Biology, Cornell University, Ithaca, United States
    Contribution
    Supervision, Funding acquisition, Methodology, Project administration, Writing—review and editing
    Competing interests
    No competing interests declared
  10. Erik C Andersen

    1. Interdisciplinary Biological Sciences Program, Northwestern University, Evanston, United States
    2. Department of Molecular Biosciences, Northwestern University, Evanston, United States
    3. Robert H. Lurie Comprehensive Cancer Center of Northwestern University, Northwestern University, Chicago, United States
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Methodology, Project administration, Writing—review and editing
    For correspondence
    erik.andersen@northwestern.edu
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-0229-9651

Funding

National Institute of General Medical Sciences (T32GM008061)

  • Stefan Zdraljevic

National Institute of Diabetes and Digestive and Kidney Diseases (DK115690)

  • Frank C Schroeder
  • Erik C Andersen

American Cancer Society (127313-RSG-15-135-01-DD)

  • Frank C Schroeder

National Institute of Environmental Health Sciences (ES029930)

  • Erik C Andersen

National Institute of General Medical Sciences (GM088290)

  • Erik C Andersen

Next Generation Fund

  • Erik C Andersen

Sherman-Fairchild Cancer Innovation Award

  • Erik C Andersen

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

The authors thank Samuel Rosenberg for assistance on early mappings of drug sensitivities, Mudra Hegde of the Broad Institute for assistance with sequence analysis, and members of the Andersen laboratory for critical reading of this manuscript.

Senior Editor

  1. Michael A Marletta, University of California, Berkeley, United States

Reviewing Editor

  1. Piali Sengupta, Brandeis University, United States

Reviewer

  1. Meng C Wang, Baylor College of Medicine, HHMI, United States

Publication history

  1. Received: July 22, 2018
  2. Accepted: March 26, 2019
  3. Version of Record published: April 8, 2019 (version 1)

Copyright

© 2019, Zdraljevic et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
