1. Computational and Systems Biology
  2. Immunology and Inflammation
Download icon

Deciphering the combinatorial landscape of immunity

  1. Antonio Cappuccio
  2. Shane T Jensen
  3. Boris M Hartmann
  4. Stuart C Sealfon
  5. Vassili Soumelis  Is a corresponding author
  6. Elena Zaslavsky  Is a corresponding author
  1. Institut Curie, Integrative Biology of Human Dendritic Cells and T Cells Laboratory, PSL Research University, Inserm, U932, France
  2. Department of Neurology, Icahn School of Medicine at Mount Sinai, United States
  3. Department of Statistics, Wharton School, University of Pennsylvania, United States
  4. Laboratoire d'immunologie, biologie et histocompatibilité, AP-HP, Hôpital Saint-Louis, France
Tools and Resources
  • Cited 1
  • Views 878
  • Annotations
Cite this article as: eLife 2020;9:e62148 doi: 10.7554/eLife.62148

Abstract

From cellular activation to drug combinations, immunological responses are shaped by the action of multiple stimuli. Synergistic and antagonistic interactions between stimuli play major roles in shaping immune processes. To understand combinatorial regulation, we present the immune Synergistic/Antagonistic Interaction Learner (iSAIL). iSAIL includes a machine learning classifier to map and interpret interactions, a curated compendium of immunological combination treatment datasets, and their global integration into a landscape of ~30,000 interactions. The landscape is mined to reveal combinatorial control of interleukins, checkpoints, and other immune modulators. The resource helps elucidate the modulation of a stimulus by interactions with other cofactors, showing that TNF has strikingly different effects depending on co-stimulators. We discover new functional synergies between TNF and IFNβ controlling dendritic cell-T cell crosstalk. Analysis of laboratory or public combination treatment studies with this user-friendly web-based resource will help resolve the complex role of interaction effects on immune processes.

Introduction

Immunological processes are regulated by a broad diversity of stimuli, such as growth factors, cytokines, integrins, and pathogen components. Studies of immune regulation have largely focused on individual stimuli, elucidating transcriptional and functional programs induced by major immune modulators such as TNF (Locksley et al., 2001; Kalliolias and Ivashkiv, 2016; Bouwmeester et al., 2004), interferons (Pestka, 2007; Schneider et al., 2014; González-Navajas et al., 2012), and Toll-like Receptor (TLR) ligands (Kawasaki and Kawai, 2014; Vidya et al., 2018). However, immunological responses are constantly shaped by the combined action of multiple stimuli. As a fundamental example, T helper (Th) cell activation requires a combination of antigen presentation and costimulatory molecules (Smith-Garvin et al., 2009). Due to combinatorial effects, the study of individual stimuli alone may provide a misleading oversimplification of their role in complex microenvironments. Developing a systematic understanding of how combinations of stimuli control the immune system is an unmet challenge, with broad biological and therapeutic implications.

What makes studying immunology from a combinatorial perspective essential is that nonlinear interactions often make the effects of combined stimuli dramatically different from merely additive effects seen with each stimulus alone. Immunological studies have discovered that interaction effects play major roles in shaping immune system processes (Noubade et al., 2014; Min et al., 2012; Tuvim et al., 2012; Walasek et al., 2012; Hartmann et al., 2014; Teles et al., 2013). However, the complexity of these effects and the lack of resolution of current combinatorial experiment analysis methods are a barrier to understanding how interactions regulate immunological functions. The standard approach to this problem is to identify genes showing positive (synergistic) or negative (antagonistic) deviations in the combined treatment condition from the sum of individual effects. This restricted binary classification has critical limitations. First, it does not distinguish between biologically relevant and artifactual interactions that depend on the specific assay (Cappuccio et al., 2015). Second, it does not allow for meaningful aggregation of interactions to delineate coherent functional programs. Finally, this standard approach does not support integration and shared interpretation of interactions from multiple combination treatments. These limitations obscure general properties and functions of interactions between immunological stimuli.

To overcome this barrier, we develop a framework based on combinatorial analysis and machine learning. Using combinatorial analysis, we derive a comprehensive taxonomy of all interaction patterns that can occur in a combination treatment experiment. Next, we train a machine learning classifier able to robustly map noisy -omics data across the pre-defined taxonomy. As we demonstrate, this framework now resolves the most informative synergistic and antagonistic effects. Annotation of genes assigned to each discrete regulatory pattern identifies the regulated immunological functions. In addition, assignment of responses from multiple combinatorial datasets to a standardized taxonomy enables global integration to systematically elucidate the interactions that shape immune functions.

The resulting immune Synergistic and Antagonistic Interaction Learner (iSAIL) resource includes the machine learning classifier, a curated compendium of immunological combination treatment datasets and a global dataset integration to produce a landscape of ~30,000 interactions from diverse immune cell types and combinations of stimuli. We show how the landscape can be mined to reveal new insight into the prevalence of different types of interactions, and into their immunological functions. Further, we show that the resource facilitates the discovery of new combinatorial regulatory processes. iSAIL analysis of data on TNF and IFNβ co-treatment led us to discover new synergistic mechanisms implicated in dendritic cell-T cell crosstalk.

iSAIL is available as a user-friendly web-accessible resource to help elucidate the combinatorial complexity of immunity.

Results

Background and motivation

The prototypical combination treatment experiment comprises -omics data obtained from the following conditions: a vehicle control (denoted by 0), two individual stimuli (X, Y), and their combination (X+Y), Figure 1A, left subpanel. The goal of this experiment is to discover cellular processes regulated by positive (synergistic) and negative (antagonistic) interactions. This requires a fine-grained analysis able to distinguish between biologically relevant and artifactual interactions, and able to aggregate similar interactions to delineate coherent functional programs. The standard approach, which classifies interactions as positive or negative, does not do this adequately, leading to a loss of important biological information. For example, biologically meaningful positive interactions such as emergent responses (e.g. an increase or decrease with the combination in the absence of any effect in the individual treatment conditions), would not be distinguished from nominal interactions with low biological significance related to saturation effects. Furthermore, the aggregation of all the positive (or negative) interactions obscures the inference of coherent biological processes as typically obtained through pathway enrichment analysis, because this crude aggregation confounds very diverse interaction effects.

Figure 1 with 3 supplements see all
Comprehensive approach to map interaction effects within and among -omics datasets from combination treatment experiments.

(A) We developed iSAIL (immune Synergistic/Antagonistic Interaction Learner), a machine learning framework to decipher the effect of combination treatments. iSAIL accommodates -omics datasets from the four prototypical conditions of combination treatments: 0 (control), stimulus X, stimulus Y, and the combination X+Y (left subpanel). The dataset is analyzed by a classifier previously trained to map each gene into a complete taxonomy of theoretically possible interaction profiles (middle subpanel, see also Figure 1—figure supplement 1). The slanting of a single profile rendition in the middle subpanel is to indicate that the slanted pattern is one of an entire deck of synergistic or antagonistic patterns classified by iSAIL. The taxonomy helps infer the functional role of different types of positive and negative interactions (right subpanel). (B) By applying iSAIL to combination treatments from diverse immune cell types (right subpanel), we built a combinatorial landscape of immunity comprising ~30,000 interactions. Global analysis of the landscape and of user-generated data drive new hypotheses on the role of interactions in immune cells.

Machine learning framework to map interaction effects from -omics data from combination treatment studies

To improve the extraction of biological insight from a combination experiment, we develop a machine learning framework within the immune Synergistic and Antagonistic Interaction Learner (iSAIL) resource. The framework increases the resolution of a standard approach by using an abstract, pre-defined taxonomy of 123 distinguishable response profiles that can theoretically occur in a combination treatment experiment (Figure 1—figure supplement 1). These profiles represent qualitatively different scenarios for the expression of a gene in a combination treatment experiment. To analyze data on combination treatment, iSAIL applies a machine learning classifier trained to robustly classify the experimental gene response across the taxonomy (Figure 1A, middle subpanel). Interaction classification makes it possible to resolve the most meaningful interactions (e.g. emergent responses), while automatically excluding from downstream analysis all the nominal interactions (e.g. saturation effects) that carry minimal biological information. Importantly, genes classified in the same interaction profile are likely to share a coherent functional program. These coherent gene groups are then interrogated for mechanistic insight via pathway enrichment analysis (Figure 1A, right subpanel). By breaking down the combinatorial effects into these functional units, we greatly improve the functional interpretation of biological processes regulated by interactions.

A key problem we address is to robustly classify gene response from noisy -omics data across. To solve this problem, the classifier generates a probabilistic assignment of the response for each gene to the taxonomy-defined profiles. This solution is more robust to noise than the deterministic matching previously used for related problems (Cappuccio et al., 2015). We systematically validate our approach by evaluating the classification performance on simulated data. Furthermore, in order to select the optimal framework, we investigate multiple machine learning models and compare their probabilistic output using the synthetic data under different noise regimes. These analyses identify Random Forest as a robust method with high predictive accuracy across all noise regimes (Figure 1—figure supplements 2 and 3).

A resource to analyze user-generated or public immunological combination treatment datasets

The taxonomy of interaction profiles used by the learning framework provides a standardized abstract space, independent of any specific dataset, where interactions from multiple datasets can be mapped and integrated for global analysis. Based on immunological relevance and data quality (see Materials and methods), we select a compendium of 25 human and 7 murine combination treatment studies for inclusion in the resource (Table 1). These datasets include studies of innate immune cells including several dendritic cell populations, macrophages, and monocytes, as well as epithelial cells and adaptive immune cells. Stimuli include diverse combinations of cytokines, Toll-Like Receptor (TLR) ligands, and drugs. We apply our classification framework to each dataset from the compendium and agglomerate the results in an overall interaction landscape, including it as a component of the resource (Figure 1B).

Table 1
Datasets used to construct the combinatorial landscape of immunity.
AccessionSpeciesCell typeSignal XSignal YTime point
GSE5054HumanThyroid cellsIFNγIL1β1d
GSE36331HumanARPE-19 cellsIFNγTNF2d
GSE43409HumanInnate lymphoid cellscocktail (IL-1/IL-7/IL-23)aNKp443.5 hr
GSE53712HumanMonocytic THP-1LPSSB2035804h
GSE53712HumanMonocytic THP-1LPSSB2035801d
GSE59179HumanHut78 cellsEnzastaurinAR-A0144183d
GSE63038HumanNK cellsFcR activationIL-1212 hr
GSE79077HumanMDMsDexamethasoneIFNγ20 hr
GSE57915HumanpDCIL3Flu6h
GSE57915HumanpDCGM-CSFFlu6h
GSE57915HumanpDCGM-CSFFlu1d
GSE57915HumanpDCGM-CSFLL371d
GSE57915HumanMonocytesNOD2TLRs6h
GSE57915HumanMonocytesNOD2TLRs1d
GSE57915HumanMonocytesIFNγTLRs6h
GSE46903HumanMacrophageIFNγTNF3d
GSE46903HumanMacrophageTNFP3C3d
GSE36323HumanMonocytic THP-1D3TsA2.5 hr
GSE52819HumanMacrophageVitamin DH37Rv24 hr
GSE44392HumanCD4+ T celledelfosinebeads30 hr
GSE24767HumanKeratinocyteIL-17TNF1d
GSE77814HumanBMSCIFNγTNF (1.5 ng/ml)2d
GSE77814HumanBMSCIFNγTNF (15 ng/ml)2d
GSE134209HumanmoDCTNFIFNβ1h
GSE134209HumanmoDCTNFIFNβ2.5 hr
GSE20302MouseDCLact acidophilusBifid bifidum10 hr
GSE28994MouseLungPam2CSK4ODN23954h
GSE32986MouseDCCurdlan (1 mg/ml)GM-CSF4h
GSE32986MouseDCCurdlan (100 mg/ml)GM-CSF4h
GSE35291MouseHSPCsValproic acidlithium7d
GSE53986MousemacrophageIFNγLPS1d
GSE62249MouseSB-3123p cellsCocktail (TNF/IFNγ)Vemurafenib4d
  1. Non-standard abbreviations: MDM = Monocyte Derived Macrophages; P3C = Pam3 CSK; D3 = nuclear hormone 1,25(OH)2D3; TsA = trichostatin A; BMSC = Bone Marrow Stromal Cells; HSPCs = hematopoietic stem/progenitor cells; Lact = Lactobacillus; Bifid = Bifidobacterium.

The implemented interaction classification methodology, along with the interaction landscape, is made available in a user-friendly platform. The platform allows researchers to upload and analyze interactions from newly generated data, or to mine the interaction landscape, derived from our compendium, for global insight. To facilitate the functional interpretation of interactions, the platform is integrated with external resources including ImmPort (Bhattacharya et al., 2014), Gene Cards (Stelzer et al., 2016), and enrichR (Kuleshov et al., 2016), that provide extensive single-gene and pathway level annotations. The iSAIL resource has the capacity to generate insight into combinatorial immunity and help guide hypothesis generation and further experimentation.

Exploring and visualizing the interaction landscape

To build the interaction landscape, we systematically applied our interaction classification framework (Figure 2A) to the data compendium, mapping a total of 29,479 interactions. The number of interactions vary widely with the different cell types and combinations of stimuli (Figure 2—figure supplement 1). The landscape, extensively annotated using external databases, is an information-rich object that can be mined to gain new insight on general properties of interactions and on the immunological functions they regulate.

Figure 2 with 1 supplement see all
Building a combinatorial landscape of immunity.

(A) We developed a strategy to map and investigate combinatorial effects from multiple combination treatment experiments. Using public repositories, we selected 32 -omics datasets of combination treatment experiments from diverse immune cells and combinations of stimuli. For each dataset, we applied iSAIL and classified a total of ~30,000 interaction effects. (B) The first seven cards represent the most frequent interactions across datasets. The last two cards represent interactions that occur with vanishingly low frequency. (C) To integrate interactions from different datasets, we created a 3D structure with axes representing datasets, genes, and scores quantifying the intensity and robustness of the effects. The resulting landscape, supplemented with metadata and prior knowledge, makes it possible to comprehensively investigate the effect of combination treatments on immune cells. (D) The plane shows a 2D projection of the landscape focusing on immune checkpoints in selected datasets: (stimulat = stimulatory; inhibit = inhibitory). The size and color of the rectangles keep track respectively of the interaction score and sign (blues: positive interactions, red: negative interactions). The immune checkpoint IDO1 is synergistically induced in multiple datasets. Non-standard abbreviations: P3C = Pam3 CSK; vD = vitamin D; GM = GM CSF; edelf = edelfosine; Flu = influenza virus. (E) By looking at the specific nature of these synergies, we found two cases of potentiation in human moDC (top) and pDC (bottom).

For example, mining the landscape enables a systematic quantification of the prevalence of interactions that may occur with combination treatments, which is currently unknown. We find that the most frequent interactions likely represent range limitations of the assay or biological responses, which we refer to as floor and ceiling effects (Figure 2B, top subpanel). These effects provide limited biological information in the absence of dose-response curves (Chou, 2006), unavailable in our context. Our approach segregates floor and ceiling effects from less frequent but more biologically relevant interactions, including suppression (9%), inhibition (8%), restoration (4%), emergence (4%), and potentiation (3%) (Figure 2B, middle subpanel). Our analysis also reveals that several theoretically possible combinatorial effects were nearly absent (<0.04%). These rare profiles include reversals, where two signals with the same individual effect (e.g. upregulation of a gene separately by X and Y) are reversed by the combination (e.g. downregulation of the same gene by X+Y) (Figure 2B, bottom subpanel).

To explore the immunological functions regulated by interactions, we visualize the landscape using an interactive 3D representation. The three axes represent datasets, genes, and scores quantifying the intensity and robustness of the identified interactions (Figure 2C). Each axis is then annotated with additional information. The dataset axis is annotated with metadata on the experiments including species, cell type, stimuli, and time point. The gene axis is annotated with immunological gene families such as chemokines, interleukins, checkpoints and other terms from the ImmPort database (Bhattacharya et al., 2014). The interaction axis is annotated by the taxonomy profiles assigned to each interaction by the machine learning classifier. Slicing the landscape along specific dimensions provides different types of insight into immunological interactions.

For example, slicing by gene family allows systematic identification of positive and negative interactions involving immune modulators of interest. Figure 2D shows a 2D projection of the landscape involving stimulatory and inhibitory checkpoints for a selection of datasets including macrophage, moDC, pDC, and T cells. Genes in these families show a variable propensity toward synergistic and antagonistic regulation across datasets. While CD40 and CD80 presented sparse, selective interaction effects across datasets, IDO1 shows positive interactions in 5 out of the 15 selected datasets. The profiles of these synergies contain two cases of potentiation from different experimental models, one involving pDC stimulated with IL3 and influenza virus (Flu), and the other involving moDC stimulated with TNF and IFNβ (Figure 2E). Because IDO1 inhibits T cell division and promotes regulatory T cells (van Baren and Van den Eynde, 2015), these positive interactions may serve to restrict overreacting immune responses in the inflammatory microenvironment.

Our results show that mining the interaction landscape can reveal new insight on combinatorial interactions and their immunological functions.

Interactions determine a context-dependent TNF biology

To further illustrate how the interaction landscape can be analyzed, we study how TNF biology can be altered by the presence of cofactors. To this end, we sliced the combinatorial landscape along two axes (Figure 3, top-left). From the dataset axis, we selected combination treatments involving TNF with other stimuli, including IFNγ and IFNβ in four human cell models (Figure 3, left-margin). From the interaction axis, we extracted interaction profiles that encoded a qualitative change of the TNF effect when considered as a mono-treatment. We started by considering three types of qualitative changes: suppression, antagonistic reversal, and synergistic reversal of TNF effects (Figure 3, top-margin).

Combinatorial interactions determine a context-dependent TNF biology.

To explore how cofactors might alter the TNF biology, we sliced the combinatorial landscape (top-left) along two axes. From the dataset axis, we extracted combination treatments involving TNF and concomitant factors including IFNγ and IFNβ in four human cellular models (left-margin). From the interaction axis, we extracted interaction profiles that encode a qualitative change of the effect of TNF mono-treatment. We started by considering three types of qualitative changes: suppression, antagonistic reversal, and synergistic reversal (top-margin). For each dataset and profile, we processed the corresponding gene list with enrichment analysis to gain insight at functional level. The matrix elements correspond to selected significantly enriched functions (hypergeometric test; * adjusted p<0.05, ** adjusted p<0.01, *** adjusted p<0.001). Example hits from each function are shown in parentheses. In the case of emergent effects (last column), we looked for the presence of new functions, not regulated by TNF alone (top right, see also Materials and methods). The results suggest that cofactors could drive the emergence of new TNF functions.

For each pair of dataset and profile, we processed the corresponding gene list with enrichment analysis to gain insight at the functional level. Genes showing suppression and reversal of TNF effects were significantly enriched in important immune processes including Th1 polarization and antigen presentation (Figure 3). Further analysis (Figure 3, top-right, Materials and methods) also suggested that co-modulators can drive the emergence of new functions, not observed by TNF stimulation in isolation. Although these emerging functions were relatively few, they comprised key processes such as proteasome degradation, an alternative NF-kB activation pathway, and T cell chemotaxis.

Our results suggest that TNF biology can be qualitatively altered by other stimuli through a variety of interaction effects including suppression, reversal, and the emergence of entirely new functions. This provides a new insight into the function of a cytokine.

Prediction and validation of TNF and IFNβ interaction effects in human monocyte-derived dendritic cells

Finally, we investigate in-depth the interactions in our newly generated dataset of TNF combined with the cytokine IFNβ (see Materials and methods), and illustrate the hypothesis generation and validation cycle enabled by iSAIL. Although TNF and IFNβ are key modulators of immune function whose individual effects have been extensively studied, their interactions remain poorly understood. We previously reported that TNF and IFNβ act synergistically to induce an antiviral state in monocyte-derived dendritic cells (moDC) (Hartmann et al., 2014). To study the systems level impact of TNF and IFNβ co-treatment on human moDC, we applied iSAIL to this transcriptomic combination treatment experiment.

iSAIL detected 374 positive interactions, which we mapped to the corresponding profile groups. The interaction groups with the largest number of positive interactions were ‘emergent synergy’, ‘TNF potentiates IFNβ’, ‘IFNβ restores TNF’, and ‘IFNβ potentiates TNF’ (Figure 4A). To understand the function of these interaction profiles, we sorted the corresponding gene lists based on the synergy score and searched for candidates with potential key immunological roles.

Prediction and validation of IFNβ and TNF synergistic effects in monocyte-derived dendritic cells .

(A) We applied iSAIL to analyze the synergistic effects induced by IFNβ (3000 pg/mL) and TNF (4500 pg/mL) on monocyte-derived dendritic cells after 1 hr of exposure. We focused on the four profile groups with the largest number of synergies. In the group of emergent synergies, we found VCAM1, a gene involved in the regulation of cell adhesion. (B) We tested the hypothesis that the emergent induction of VCAM1 in the DC would mediate an increase in DC-T cell interaction. Assayed by imaging flow cytometry, moDCs exposed to IFNβ+TNF showed an emergent increase in DC-T cell adhesion. VCAM-1 neutralization abolished this synergistic effect. The right subpanel shows the mean synergy score of the three replicates, plus or minus the standard error of the mean. VCAM-1 neutralization significantly reduced the synergy score (t-test, p=0.03). (C) To explore whether TNF+IFNβ synergy profiles represented coherent gene programs, we determined the functional enrichment for each of the four patterns, shown in the columns. (C, left panel). We compared these iSAIL-based functional analyses to results obtained with a conventional analysis of unclassified synergy genes. (C, right panel). iSAIL-based enrichment provided a richer functional annotation. In particular, it suggested that synergy genes in the ‘IFNβ potentiates TNF’ pattern rightmost panel in (A) may mediate T cell proliferation. (D) This hypothesis was tested using allogeneic cross donor stimulation. The synergy group ‘IFNβ potentiates TNF’ contained two hits potentially responsible for T cell proliferation: TNFSF9 and CCL5. Using CCL-5 neutralizing antibodies, we confirmed that IFNβ and TNF act in synergy to promote T cell proliferation, and that this proliferation depends on CCL-5. The right subpanel shows the mean synergy score of the three replicates, plus or minus the standard error of the mean. CCL-5 neutralization significantly reduced the synergy score (t-test, p=0.008).

Focusing on emergent synergies, we found the largest synergy scores for LIMK2, MCOLN2, SLC7A5, TP53BP2, and several genes having more established immunological roles such as RELB, IL15RA, and VCAM1. The protein VCAM-1 has been described as a regulator of leukocyte migration and cell adhesion (Deem et al., 2007). Due to the fundamental importance of the DC-T cell axis in the generation of an immune response, we hypothesized that TNF and IFNβ synergistically induce VCAM-1 to promote moDC-T cell adhesion. We experimentally tested this hypothesis by quantifying DC-T cell adhesion using imaging flow cytometry (see Materials and methods). When exposed to the combination of TNF and IFNβ, moDC showed an increased adhesion to T cells that was not observed with either cytokine alone (Figure 4B). The increased DC-T cell adhesion was mediated by VCAM-1, since VCAM-1 neutralization abolished the synergistic effect (Figure 4B). To our knowledge, these results identify for the first time a role for synergistic induction of VCAM-1 by TNF and IFNβ in promoting DC-T cell adhesion.

To further explore the immune processes controlled by TNF and IFNβ synergies, we performed enrichment analysis (Figure 4C, left) separately for the different profiles, and compared the results with a conventional analysis that aggregates all the synergies in a single gene set (Figure 4C, right). Annotation by profile increased the number and significance of annotation terms compared to the conventional analysis. Interestingly, we found a reciprocal potentiation of TNF and IFNβ signaling pathways by the combination treatment. Both iSAIL and conventional analysis captured a highly significant enrichment in mineral absorption. However, iSAIL analysis also revealed the pattern ‘IFNβ restores TNF’ as the main contributor to this enrichment. This pattern contains several members of the family of metallothioneins (MT1X, MT1E, MT1F, MT1HL1), which are increasingly recognized as important players in the response to cytokines and pathogen signals (Subramanian Vignesh and Deepe Jr., 2017).

Some annotation terms appeared only in the iSAIL analysis and were not resolved by conventional analysis. In particular, we found the pattern ‘TNF potentiates IFNβ’ enriched in NK cell proliferation, and the pattern ‘IFNβ potentiates TNF’ enriched in T cell proliferation. We studied allogeneic cross-donor stimulation to test the hypothesis suggested by this analysis that a synergistic effect on T cell proliferation resulted from IFNβ and TNF stimulation (Figure 4D, left panel). We observed a nearly twofold increase in the percent of proliferating T cells upon the combination treatment, an effect not seen with either stimulus alone. The synergy pattern of the T cell proliferation measurement diverged slightly from the gene level profile, which showed some effect by TNF alone. It is not surprising that the interactions profiles comparing mRNA level regulation and protein-dependent functional effects show marginal differences. Importantly, the synergistic induction of T cell proliferation, predicted by iSAIL, was validated experimentally.

Next, we wanted to identify the molecular mediators of the increased T cell proliferation. Candidate genes for mediating this synergistic response predicted by enrichment analysis included TNSFS9 and CCL5. Review of the literature suggested CCL-5 as the most likely candidate (Makino et al., 2002). We therefore hypothesized that CCL-5 may contribute to increased proliferation of T cells induced by TNF and INFβ exposed DC. This hypothesis was confirmed by immunoneutralization of CCL-5 (Figure 4D, right panel). These validation experiments demonstrate the value of iSAIL in mining interaction data for new hypotheses leading to biological and mechanistic insight.

Discussion

In this work, we present a comprehensive resource to map and interpret interaction effects from individual and multiple combination treatment studies. Applying our machine learning classification method to the compendium of immunological combination treatment datasets, we identify ~30,000 interactions. We show that the collection of interactions visualized as a landscape can be mined for general insight into the microenvironment-dependent role of key immunomodulators, for specific testable hypotheses about combinatorial control of immune processes.

A potential consequence of interactions is the radical modification of effects observed with individual treatments. Using the iSAIL framework, these events are easily identified by isolating interaction profiles that encode qualitative combination changes in the effect of a treatment of interest. In the case of TNF, we found that co-modulators alter fundamental immunological processes, such as antigen presentation and T helper cell polarization, and may produce the emergence of entirely new functions, not modulated by TNF exposure alone. Identifying the context-dependent effects of an agent may be useful from a therapeutic perspective. Our approach may assist in the design of drug combinations that leverage antagonistic interactions to selectively reduce pathogenic activity while preserving necessary homeostatic effects.

A fundamental tool for biological interpretation of -omics experiments is pathway-level enrichment analysis. iSAIL uncovers biological processes regulated in combination treatment studies by fine-grained aggregation of genes showing similar interaction responses. These aggregates can reflect coherent biological processes, providing insight into the principles and functions of interactions. In the analysis of TNF and IFNβ co-treatment, iSAIL uncovered novel synergies that control the DC-T cell interactions and T cell proliferation, both of which are critical immune processes. Importantly, iSAIL analysis generated hypotheses on the molecular mediators controlling these functions, VCAM-1 and CCL-5, which were validated by additional experiments. The ability to identify new interactions and mechanisms in a relatively well characterized combination such as TNF and IFNβ points to an even greater potential for discovery in less characterized combinations.

In its current version, iSAIL was developed to analyze transcriptomics experiments. Consistent with this, the classifier was trained under the assumption of normally distributed data, which is commonly held in the analysis of microarray and RNA-seq data (upon a suitable transformation). However, under alternative distributional assumptions, our framework can be adapted to any -omics data from combination treatment experiments, including proteomics, metabolomics, and epigenomics.

Understanding the dynamic properties of combinatorial interactions is a very relevant, yet largely unexplored area. Included in our compendium are time-course datasets with only two time points. Due to such poor temporal sampling, these studies were analyzed as separate experiments. Although the dynamics of combinatorial responses is not addressed in our work, the iSAIL framework would still provide a fundamental grammar to map gene trajectories from longer time series. Time-dependent interrelationships could be included in our analysis, for example, by imposing additional constraints that guarantee smooth transitions between interaction profiles at successive time points. Our analysis framework, applied to a large compendium of time course experiments, should they become available, will enable the identification of preferential trajectories and dynamic rules governing time transitions between interaction profiles.

Despite the centrality of combinatorial interactions to the control of the immune system, many important questions remain open. In particular, the ability to predict combinatorial interaction effects based on the individual effects remains elusive (Preuer et al., 2018; Ryall and Tan, 2015; Menden et al., 2019). Although predicting interactions is unrelated to the objectives of our study, iSAIL may still support research in this direction. For example, our global estimates of the frequency of different interactions may be used as prior information to enhance the predictive power of future models.

Finally, insight into combination treatment experiments is relevant to the field of drug therapeutics, as thousands of clinical trials in the United States alone are studying the effects of drug combinations (Schmidt, 2017; Nature Medicine, 2017; Wu et al., 2015). Drug interactions can drive both clinical therapeutic efficacy and treatment side effects. The success of drug combinations relies on the control of pathogenic pathways while preserving the homeostatic pathways. By uncovering relevant interactions and their functions, iSAIL user-friendly online tools can further the understanding and therapeutic applications of interaction mechanisms.

Materials and methods

Definition and simulation of interaction profiles

Request a detailed protocol

The notion of interaction profile was introduced in our previous work (Cappuccio et al., 2015) and is briefly summarized here. Figure 1—figure supplement 1 shows the taxonomy of 123 profiles used in this study. Mathematically, each of these profiles corresponds to a linear system of inequalities satisfied by the mean expression levels of a gene in the conditions 0, X, Y, X+Y. These mean expression levels are respectively denoted by eO¯eX¯eY¯eX+Y¯. The system that defines a given profile admits infinitely many solutions, each of which can be seen as a particular instance of the profile. For example, an emergent synergy (Figure 1—figure supplement 1, profile 3) is satisfied by the vectors (2.2, 2.2, 2.2, 5.5), (4.1, 4.1, 4.1, 7.8), as well as by infinitely many other qualitatively similar vectors.

To simulate a profile, our strategy starts by sampling the solution space of the corresponding system of inequalities. The solution space is sampled within a range of admissible values, calibrated to mimic the typical numerical ranges of -omics data. Next, a noise term is added to each instance of a profile using a random number generator. To account for the relatively small number of samples in -omics data, we simulated four replicates for each of the conditions 0, X, Y, X+Y. The noise term is assumed to be normally distributed. This assumption is widely held in the analysis of microarray data, and still applicable to RNA-seq data upon a suitable transformation (Law et al., 2014). Increasing noise levels correspond to a decreasing effect size, which is defined in terms of the standardized differences between the group means in the four conditions, as further described below.

The steps to simulate interaction profiles are as follows:

  • Definition of a range of admissible expression values. This was chosen as the interval [−14, 14], consistent with the log2-transformed expression values from microarray and RNA-seq upon a Voom transformation (Law et al., 2014).

  • Sampling the solution space for the given profile in the specified range. The vector of numbers (e0¯, eX¯, eY¯, eX+Y¯) for the given profile were found with the function xsample from the package limsolve. For each profile, we extracted 400 instances for each level of noise.

  • Definition of the signal of a simulated profile. The signal can be seen as a generalization of the fold-change in A vs. B experiments. In a combination treatment, we first consider all pairwise fold-changes from the conditions 0, X, Y, X+Y. Except for the case of constant genes, at least one of these contrasts must be different from 0. The signal δ is defined as the absolute value of the smallest non-zero differences. To avoid very weak signals, not meaningful in the analysis of expression data, a minimum signal of 0.5 is used in the simulated data.

  • Simulation of random noise. To simulate random variability around the values e0¯, eX¯, eY¯, eX+Y¯, the data was assumed to be normally distributed around the group means: eiN(ei¯,σ), with i = 0, X, Y, X+Y. The parameter σ was assumed the same for all groups. For each of the four conditions, we simulated n=4 replicates. Different levels of noise, were simulated by setting different values of the ratio δ/σ.

  • Enforcement of the range of the expression values. The addition of random noise can push some of the values ei outside the initially prescribed range of expression. In this case, we forced the simulated values to be at the limit of the range. For example, a value of -18.5 was reset to -14, the lower limit of the prescribed range.

Training and testing of the machine learning classifier

Request a detailed protocol

To train the machine learning classifiers, we generated a training set by simulating multiple instances of every profile in the admissible range of expression values. Each instance of a simulated profile was represented as a vector of statistical features. These include the estimated mean values e0¯, eX¯, eY¯, eX+Y¯, the p-values of all possible pairwise contrasts among these four values, and additional statistics returned by the Limma package (Ritchie et al., 2015) served as predictors of the true class.

The training set comprised different noise regimes. These were simulated by fixing different values for the parameter δ/σ, as described in the previous section. We considered the following values: low noise (δ/σ = 4), medium noise (δ/σ = 2.5), high noise (δ/σ = 2). Training of the machine learning classifiers was done using the R packages Caret and RandomForest.

To select the best model, we generated additional simulated data and measured the out-of-sample classification accuracy per profile and for the same values of δ/σ that were used to build the training set. For each of these values, the accuracy was quantified as the proportion of correct predictions. An advantage of RF and LDA over the deterministic match is the possibility for a ‘soft’ (i.e. probabilistic) classification of an input profile into any element of the taxonomy. The quality of the full probabilistic output of RF and LDA was quantified for all noise levels using the multi-class log gain (Figure 1—figure supplement 2D). To further assess the performance of RF and LDA for the different classes, the distribution of precision and recall over all taxonomy classes was examined (Figure 1—figure supplement 2). All these criteria indicated RF as the most robust classifier. The multiclass log-gain, and the class-specific precision and recall were computed with the R packages MLmetrics and mltest.

Generating the combinatorial landscape of immunity

Request a detailed protocol

Public combination treatment datasets were retrieved from Gene Expression Omnibus using the package GEOquery (Davis and Meltzer, 2007). The relevant datasets were identified using key terms typically associated with combination treatments such as 'synergy', 'antagonism', 'combinatorial', and similar. This approach was designed to facilitate automatic update of the resource as new -omics data from combination treatment experiments become publicly available. To facilitate comparisons, all the datasets were imported in the format of the original publications. The datasets were preprocessed as follows. First, if different probes were available for the same gene, the probe with the largest coefficient of variation was selected. Second, genes with low coefficient of variation (lower than the median across all genes) were filtered out. Next, differentially expressed genes were determined with the Limma package. A significance cutoff of 0.05 was applied on the p-values corrected for multiple testing. An additional cutoff was imposed on the δ (defined above): genes with δ lower than the median value computed across all differentially expressed genes were filtered out. The resulting differentially expressed genes were then analyzed by the machine learning classifier which assigned to each gene the predicted element of the taxonomy. For each identified interaction, a score was defined to measure its magnitude as well as its significance. The magnitude of the interaction b was measured as the average Bliss index, defined as the average deviation from additivity b=ΔeX+Y¯(ΔeX¯+ΔeY¯), where Δei¯=ei¯e0¯ (i = X, Y, X+Y), b>0 for positive interactions and b<0 for negative interactions. The significance of the interaction was measured as the class probability p returned by the classifier. To account for both the significance and the magnitude of an interaction, an overall score was defined as the product bp.

The identified interactions were annotated using a manually curated list of stimulatory and inhibitory immune checkpoints, as well as the gene lists provided by the ImmPort database (Bhattacharya et al., 2014).

Enrichment analysis

Request a detailed protocol

The functional enrichment of interactions was done using the Enrichr library (Kuleshov et al., 2016). Four annotation databases were considered: GO Biological Processes (2017b), KEGG (2016), Wikipathways (2016), Reactome (2016). Enrichment was considered significant if the enrichment p-value adjusted for multiple testing was lower than 0.05.

To analyze the synergies induced by IFNβ and TNF co-treatment, we focused on annotation terms with size lower than 500 genes, to increase the specificity of the identified functions and pathways. Moreover, we imposed a minimum threshold on the overlap between the annotation term and the gene set being analyzed. This threshold was meant to identify annotation terms covering a minimum proportion of the gene set being analyzed. We chose a minimum coverage of 2%.

DC differentiation

Request a detailed protocol

All human subjects research protocols were reviewed and approved by the IRB of the Icahn School of Medicine at Mount Sinai. Monocyte-derived DCs were obtained from healthy human blood donors following a standard protocol described elsewhere (Hartmann et al., 2017). All experiments were replicated using cells obtained from different donors.

Microarray data of human moDC treated with TNF and IFNβ

Request a detailed protocol

DC were treated with 4500 pg/mL TNF, 3000 pg/mL IFNβ, or the combination of both for either 1 hr or 2.5 hr. Untreated DC served as a negative control. Three samples were taken per treatment and time point. RNA was extracted with the RNeasy plus kit (Qiagen) following the manufacturer’s instructions. Gene expression was assayed using broad human genome specific HG-U133_Plus_2 GeneChip expression probe arrays (Affymetrix). Affymetrix microarray data were normalized using gcRMA (Wu and Irizarry, 2004). Additional data processing was done as described above (see in Materials and methods section ‘Generating the combinatorial landscape of immunity’).

Experimental validation of TNF and IFNβ synergistic effects

Request a detailed protocol

To test the involvement of VCAM-1 in mediating TNF and IFNβ induced synergy, DCs were exposed to TNF, IFNβ, the combination of TNF and IFNβ or control as described above. Four hours after treatment DCs were exposed to allogeneic T cells in a 1:3 ratio for an additional 4 hr and then fixed with paraformaldehyde. Cells were stained with fluorochrome labeled antibodies against CD11c (DCs) and CD3 (T cells) and analyzed by imaging flow cytometry. DCs interacting with T cells were identified in images where one or multiple T cells had a direct contact with a DC.

To test the involvement of CCL-5 on TNF and IFNβ induced synergy, DCs were exposed to the cytokine mixtures as described above. After 4 hr, DCs were exposed to CFSE stained allogeneic T cells for 5 days and then fixed with paraformaldehyde. Cells were stained with a monoclonal antibody against CD11c and the extent of T cell proliferation was measured by the dilution of CFSE in the CD11c-negative population, as CFSE gets weaker with every T cell division. The stain to exclude DCs was necessary as DCs also digest CFSE-positive parts of T cells.

Data availability

Gene expression data generated for this study have been deposited in GEO under accession code GSE134209. All other dataset accession codes analysed in this study are included in the manuscript (Table 1). All the analyses have been implemented in R. An interactive R Shiny application of iSAIL can be found at https://isail.shinyapps.io/test_app/. The site also contains downloadable code and documentation to run the software locally.

The following data sets were generated
    1. Hartmann BM
    2. Zaslavsky E
    (2019) NCBI Gene Expression Omnibus
    ID GSE134209. Microarray data of human moDC treated with IFNβ and TNFα.

References

  1. Conference
    1. Wu M
    2. Sirota M
    3. Butte AJ
    4. Chen B
    (2015)
    Characteristics of drug combination therapy in oncology by analyzing clinical trial data on ClinicalTrials.gov
    Pacific Symposium on Biocomputing.

Decision letter

  1. Alfonso Valencia
    Reviewing Editor; Barcelona Supercomputing Center - BSC, Spain
  2. Satyajit Rath
    Senior Editor; Indian Institute of Science Education and Research (IISER), India

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

This resource aggregates information on interactions from public immunological

(interleukins, checkpoints, and other immune modulators) data sets to facilitate the analysis of combinations of drug and ligands onto immune cell activation. The system, based on underlying random-forest classification to sort out experimental noise and reveal new properties, provides information at the level of general functional classes (enrichment analysis) associated to estimated error levels, within a user oriented graphical interface.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting your work entitled "Deciphering the combinatorial landscape of immunity" for consideration by eLife. Your article has been reviewed by three peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by a Senior Editor. The reviewers have opted to remain anonymous.

Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that your work will not be considered further for publication in eLife.

The paper describes an method to study interactions in a large immunological molecular system of ~30,000 interactions: The main hypothesis derived from the system is related with the immunological effects of TNFα on IFNβ.

There are a number of problems with the current submission. First, the proposed method has not been systematically validated beyond the arguments in this paper. The example chosen corresponds to a well-known system, e.g. TNF cytokines synergies, for which it is unclear how much the paper adds. In any case, to have a significant impact in biology a more challenging hypothesis , going beyond the already-known synergies of TNF, should be proposed and ideally, it would be followed by additional experiments. Only then the potential of the system will be demonstrated as actually able to uncover new biology.

While based on this criticism the paper will not be acceptable for publication, it might be considered if organized as a resource paper (A Tools and Resources article allows authors to publish the details of new experimental techniques, datasets, software tools, and other resources). It will be also possible to consider a future version of this work including a complementary validation of the method and experimental follow-up of the proposed models.

Reviewer #1:

The authors have developed an interesting resource with 30k interactions extracted from public immunological datasets (25 human and 7 murine combination treatment studies).

On this data set they apply a aggrupation and training strategy (random forest algorithm operating on what are called "logic principles" of the interactions) to differentiate single responses from combine ones (synergies) in a set of interleukins, checkpoints, and other immune modulators. The system works at a semi-qualitative level that takes into account the fluctuations and variability of the data, and provides an interpretation at the level of general functional classes (enrichment analysis).

The potentially more interesting innovation of the model is the systematic introduction of error levels. The most valuable asset is the collection of interactions from public data. The proposed machine learning technology, the biological analysis and the grouping of genes is also rather standard and simple. The visualization is nice and help to understand the results.

The key problem is that the method is not validated in any systematic way. The proposed experimental follow-up might be interesting from a biological point of view but cannot be considered systematic validation. It might be a nice tool for exploration but the information presented is not enough to consider it as a prediction system.

Other points are related with the way the methodology is presented, that is rather obscure when in practice it is very simple, i.e. functional inference is enrichment analysis, machine learning interaction classification is a random forest classifier, omic analysis is expression data, etc. While the main idea of combining and averaging results is not explained clearly.

My opinion is that this paper could be considered for publication if it can be presented as a resource for the exploration of immunological interactions based on the collected data, gene aggregation strategy and visualisation system. In this case, it would have to be written in a different way, clearly oriented to use of the system and providing the biological results as examples of use.

Reviewer #2:

In this article, Cappuccio et al. introduce a computational method to better assess the impact of combinations of drug and ligands onto immune cell activation. This is a problem of fundamental significance, both from the theoretical side (how to tackle the combinatorial complexity of immunological interactions?) and from the practical side (how to leverage this combinatorial complexity to maximize clinical perturbation?). There exist many experimental datasets that have attempted at reporting the transcriptional responses of cells under single and dual perturbations, yet, there does not exist a systematic framework to integrate, classify and leverage these datasets. This publication introduces a Synergistic/Antagonistic Interaction Learner (iSAIL) to tackle this issue and to publicize the method.

One of the key premises of this computational framework is that individual measurement can lead to inaccurate classification of molecular perturbations, because of limited signal/noise and/or saturation. However, when leveraging the large number of datasets already available, one can improve on this computational task and get a more accurate classification. Hence, the key novelty in this manuscript is presented in Figure 2 and 3 where the statistical framework is introduced. Again, the sobering experimental fact is that individual experiments (e.g. dataset acquired for a given cell type and/or a limited set of perturbation) can be misleading classifying how two molecular perturbations can combine. The computational framework proposed here use random-forest classification to sort out this experimental noise and reveal new properties.

Figure 4 and 5 present one example related to responses to IFN-b and TNF to highlight an emergent function of these cytokines as highly synergistic depending on the molecular/cellular context. One could question whether this method scales up to higher level of combinations of perturbations: given the exponential explosion when combining N drugs, how do the computational framework introduced here help in designing higher order of immunological perturbation?

Overall, this is a serious/robust study addressing an important issue in pharmacology and immunology. Additional details about the method would help the reader better grasp the novelty of the computational method but release of the iSAIL method on a website is making this criticism less stringent: experimentalists will be able to test the method directly and assess its usefulness. More specific comments/suggestions are listed below.

Detailed comments:

Results first paragraph: What is defined here as synergy and antagonism is not congruent with the classical definitions of these terms. One example of this is profile 4 in Figure 1—figure supplement 1, which shows that individual treatments inhibit expression of a particular gene, and that the combination of treatments enhances that inhibition. Classically, this would be considered synergism in the ability of inhibiting expression of that gene, but in your classification this is considered as antagonism, because the final expression levels are lower than the expected by additivity. In your previous paper (Cappuccio et al., 2015), your classification was in agreement with previous definitions, and this interaction was classified as "negative synergy". Unless you provide support and clear explanation for changing this nomenclature, I would recommend going back to the nomenclature used in Cappuccio et al., 2015, namely calling them "positive and negative interactions", rather than "synergistic and antagonistic interactions". Additionally, for cases were the combination interactions were not as dramatic as the few examples showed in the figures, or for profiles called "ceiling" or "floor", it is not possible to determine synergism or antagonism in the absence of dose-response curves obtained by -omics experiments (Chou et al., 2006). The absence of dose-response information would also be problematic for genes which regulation is not monotonic as a function of dose (Senthivel et al., 2016).

Referring to "emergent responses (see Figure 1—figure supplement 1 , profile 3)", make effect bigger in Figure 1—figure supplement 1 so it is clearer to see the differences between profile 3 and 7.

Referring to "would be aggregated with nominal synergistic effects due to frequent cellular or assay saturation, which have low biological significance (see Figure 1—figure supplement 1 , profile 7)”. Please explain better the text before the comma. I think in profile 7 you are referring to the "floor" profile, where background in the assay or basal levels of gene expression makes it look like synergistic, and in your current explanation you are referring to the "ceiling" profile, where saturation is the problem. You can write something similar as it appeared later in the paper regarding these interactions that are likely not meaningful: "represent range limitations of the assay or biological responses"

Multiple instances:

Please replace TNF-α with simply TNF. The former term is obsolete (there is no TNF-β…) and has been retired.

Figure 1: The panels for gene 2 and gene N could be permuted to maintain the hierarchy shown in the second column (synergistic > additive > antagonistic). The graphical rendition of drug interactions as slanted panel in the second column is hard to interpret: what do the authors want to highlight there?

Figure 2D: stimulat -> stimulate

Figure 2E: Specify which IFN.

Figure 5 legend: Include concentrations of IFN-β and TNF (these could be normalized to EC50 to make it easier to read and interpret -is the experiment done at saturation -ceiling-, at subdetectable levels -floor- or in linear regime ;around EC50)

Label the columns in panel c

Reviewer #3:

This study utilizes machine learning to examine gene expression datasets of immune stimulation and attempts to derive a universally applicable set of canonical responses. The basic approach dates back to the work of Janes et al. (Science 2005), which, incidentally, is not cited. My main concern centers on the need for a relatively large amount of data combined with synthetic data in order to obtain insights focused on well-studied interactions of TNF with other cytokines. Furthermore, I am unclear as to the true universal applicability of this approach, for the following reasons:

1) The authors mention that the datasets used involved multiple time points, but it is unclear how dynamics are being handled and modeled across these datasets. One would fully expect that in any extensive time course study of two stimuli, multiple outcomes (e.g. potentiation followed by tolerance or stimulation followed by suppression) would be observed. How can the algorithm then decompose such phenomena into a single characteristic response (i.e., does the algorithm treat each time point as a separate experiment, thereby losing true time-dependent interrelationships?)

2) The system chosen for validation, namely TNF + IFN-β, does not seem particularly novel. It has been appreciated for decades that TNF synergizes with other cytokines to produce an array of effects, including potentiation and tolerance as examples of preconditioning. A demonstration of a truly unexpected set of interactions between stimuli would have been more convincing.

3) The additional validation studies focused on VCAM1 also seem to be focused on relatively well-studied phenomena. Multiple other gene transcripts that appear more novel are not followed up.

https://doi.org/10.7554/eLife.62148.sa1

Author response

[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]

Reviewer #1:

The authors have developed an interesting resource with 30k interactions extracted from public immunological datasets ( 25 human and 7 murine combination treatment studies).

On this data set they apply a aggrupation and training strategy (random forest algorithm operating on what are called "logic principles" of the interactions) to differentiate single responses from combine ones (synergies) in a set of interleukins, checkpoints, and other immune modulators. The system works at a semi-qualitative level that takes into account the fluctuations and variability of the data, and provides an interpretation at the level of general functional classes (enrichment analysis).

The potentially more interesting innovation of the model is the systematic introduction of error levels. The most valuable asset is the collection of interactions from public data. The proposed machine learning technology, the biological analysis and the grouping of genes is also rather standard and simple. The visualization is nice and help to understand the results.

The key problem is that the method is not validated in any systematic way. The proposed experimental follow-up might be interesting from a biological point of view but cannot be considered systematic validation. It might be a nice tool for exploration but the information presented is not enough to consider it as a prediction system.

When analyzing combinatorial treatments datasets, there are two major challenges. One is classifying and interpreting the presence of interactions. The second is predicting the occurrence of interactions from single treatment data. From this reviewer’s critique, we suspect that these two different goals (classifying and interpreting vs predicting interactions) have not been clearly distinguished in introducing the problem setting, and this may have caused a lack of clarity regarding the main contributions of our work. In this context, we clarify both the objective and validation of our framework below.

Objective of the method: Our method is not a “prediction system” i.e. our objective is not to predict, denovo, the occurrence of interactions from individual responses. The goal of our method is to identify, using a statistically rigorous approach, to classify and to interpret interactions. Rigorous identification of interaction effects is not a trivial matter. When placed in context, the value of this advance, which is recognized by this reviewer, becomes evident, the rigorous interaction identification component of our work is equivalent to how the study of individual gene fold-changes in transcriptomics datasets only advanced after the systematic introduction of error levels.

Systematic Validation: Consistent with our main objectives of detecting and interpreting interactions, we provide validation of our framework at two levels: (1) by testing the performance of our method on simulated data, where the classification of interactions can be assessed in the presence of the ground truth, and (2) through validation experiments confirming the correct biological interpretation of new synergies between TNF and IFNb.

Finally, while we do not aim to directly predict interactions, we show that our framework does provide new insight on general rules that govern combinatorial interactions. As we state in the discussion, these general rules can be used in future work aiming to predict interactions, but this possibility is tangential to the main research objective of the present manuscript. We acknowledge that our results on the probabilistic algebra may have generated confusion, as that section is thematically close to predictive modelling. Thus, in this revised, resource-oriented version, we have removed this section from the Results.

To clarify these points, we have extensively revised the Introduction and the Discussion sections.

Other points are related with the way the methodology is presented, that is rather obscure when in practice it is very simple, i.e. functional inference is enrichment analysis, machine learning interaction classification is a random forest classifier, omic analysis is expression data, etc. While the main idea of combining and averaging results is not explained clearly.

We appreciate the reviewer’s suggestions and have revised the manuscript to streamline the description of the method and improve the presentation of aggregating gene responses for more fine-grained enrichment analysis.

My opinion is that this paper could be considered for publication if it can be presented as a resource for the exploration of immunological interactions based on the collected data, gene aggregation strategy and visualisation system. In this case, it would have to be written in a different way, clearly oriented to use of the system and providing the biological results as examples of use.

We appreciate the reviewer’s suggestion and are now submitting a revised version of our manuscript for consideration as a Tools and Resources article. In this revised version we present our work as a collection of tools and datasets for the exploration of immunological interactions. The biological results are now presented as a set of use-cases of our system.

Reviewer #2:

In this article, Cappuccio et al. introduce a computational method to better assess the impact of combinations of drug and ligands onto immune cell activation. This is a problem of fundamental significance, both from the theoretical side (how to tackle the combinatorial complexity of immunological interactions?) and from the practical side (how to leverage this combinatorial complexity to maximize clinical perturbation?). There exist many experimental datasets that have attempted at reporting the transcriptional responses of cells under single and dual perturbations, yet, there does not exist a systematic framework to integrate, classify and leverage these datasets. This publication introduces a Synergistic/Antagonistic Interaction Learner (iSAIL) to tackle this issue and to publicize the method.

One of the key premises of this computational framework is that individual measurement can lead to inaccurate classification of molecular perturbations, because of limited signal/noise and/or saturation. However, when leveraging the large number of datasets already available, one can improve on this computational task and get a more accurate classification. Hence, the key novelty in this manuscript is presented in Figure 2 and 3 where the statistical framework is introduced. Again, the sobering experimental fact is that individual experiments (e.g. dataset acquired for a given cell type and/or a limited set of perturbation) can be misleading classifying how two molecular perturbations can combine. The computational framework proposed here use random-forest classification to sort out this experimental noise and reveal new properties.

Figure 4 and 5 present one example related to responses to IFN-b and TNF to highlight an emergent function of these cytokines as highly synergistic depending on the molecular/cellular context. One could question whether this method scales up to higher level of combinations of perturbations: given the exponential explosion when combining N drugs, how do the computational framework introduced here help in designing higher order of immunological perturbation?

Overall, this is a serious/robust study addressing an important issue in pharmacology and immunology. Additional details about the method would help the reader better grasp the novelty of the computational method but release of the iSAIL method on a website is making this criticism less stringent: experimentalists will be able to test the method directly and assess its usefulness. More specific comments/suggestions are listed below.

We appreciate the reviewer’s positive evaluation of our study. The reviewer notes both the key methodological novelty of our framework, as well as the extensive validation studies.

Regarding scaling up iSAIL to combinations of more than two stimuli, we would like to emphasize the fact that our approach is not intrinsically limited to pairwise combinations. In principle, the main steps of our framework (e.g., combinatorial analysis, generation of simulated data, and training of a classifier) could be implemented to handle the case of more than two stimuli. However, the main limitation is due to data scarcity. Our search of public data repositories (datasets deposited before 2019) showed that immunological combinatorial datasets with more than two stimuli are extremely rare, precluding systematic research in this direction. Although higher order perturbation combinations are an important aspect of research into combinatorial treatments, due to a complete lack of appropriately structured datasets and to the need for further extensive development, we consider it beyond the current scope of our work. To acknowledge the relevance of the higher order combinations, we have now addressed this point in the Discussion.

Detailed comments:

Results first paragraph: What is defined here as synergy and antagonism is not congruent with the classical definitions of these terms. One example of this is profile 4 in Figure 1—figure supplement 1, which shows that individual treatments inhibit expression of a particular gene, and that the combination of treatments enhances that inhibition. Classically, this would be considered synergism in the ability of inhibiting expression of that gene, but in your classification this is considered as antagonism, because the final expression levels are lower than the expected by additivity. In your previous paper (Cappuccio et al., 2015), your classification was in agreement with previous definitions, and this interaction was classified as "negative synergy". Unless you provide support and clear explanation for changing this nomenclature, I would recommend going back to the nomenclature used in Cappuccio et al., 2015, namely calling them "positive and negative interactions", rather than "synergistic and antagonistic interactions". Additionally, for cases were the combination interactions were not as dramatic as the few examples showed in the figures, or for profiles called "ceiling" or "floor", it is not possible to determine synergism or antagonism in the absence of dose-response curves obtained by -omics experiments (Chou et al., 2006). The absence of dose-response information would also be problematic for genes which regulation is not monotonic as a function of dose (Senthivel et al., 2016).

As pointed out by the reviewer, we modified the nomenclature presented in our previous work. One of the main changes, in particular, was to replace the terms of “positive” and “negative” interactions with the terms “synergistic” and “antagonistic” interactions, respectively. However, we would like to point out that, while the nomenclature has been revised, the precise definition of “synergistic” (“positive”) interactions and of “antagonistic” (“negative”) interactions are consistently applied in our previous and current works. In both cases, “synergistic” (positive) interactions comprise super-additive effects, while “antagonistic” (negative) interactions are sub-additive effects. To illustrate this point, we refer to the profiles 4, and 6 in Figure 1—figure supplement 1. From the qualitative point of view, the two profiles represent the same response pattern of a gene in the condition 0, X, Y, X+Y. That is, both profile 4 and 6 express the same ranking of the expression values in the four conditions. Such ranking is compatible both with a positive (synergistic) and with a negative (antagonistic) interaction. However, what makes profile 4 and 6 different is the comparison between the expression value of X+Y and the expected level of additivity. In profile 4 (antagonistic, or negative interaction), the expression value is lower than the expected additivity; in profile 6 (synergistic, or positive interaction), the expression value is larger than the expected additivity.

Due to the reviewer’s comment, we acknowledge that the change of the terms “positive” and “negative” interactions with “synergistic” and “antagonistic” interactions may generate confusion. To address this issue, we accept the reviewer’s suggestion to revert back to our original nomenclature, namely calling "positive and negative interactions", rather than "synergistic and antagonistic interactions".

Regarding the impossibility to determine synergism or antagonism in the absence of dose-response curves obtained, we agree with the reviewer. These effects are problematic and difficult to interpret but, at the same time, very frequent. Unless otherwise specified by the user, iSAIL automatically excludes these ceiling and floor effects from downstream analysis.

Referring to "emergent responses (see Figure 1—figure supplement 1 , profile 3)", make effect bigger in Figure 1—figure supplement 1 so it is clearer to see the differences between profile 3 and 7.

This has been revised.

Referring to "would be aggregated with nominal synergistic effects due to frequent cellular or assay saturation, which have low biological significance (see Figure 1—figure supplement 1 , profile 7)”. Please explain better the text before the comma. I think in profile 7 you are referring to the "floor" profile, where background in the assay or basal levels of gene expression makes it look like synergistic, and in your current explanation you are referring to the "ceiling" profile, where saturation is the problem. You can write something similar as it appeared later in the paper regarding these interactions that are likely not meaningful: "represent range limitations of the assay or biological responses"

This has been revised.

Multiple instances:

Please replace TNF-α with simply TNF. The former term is obsolete (there is no TNF-β…) and has been retired.

This has been revised.

Figure 1: The panels for gene 2 and gene N could be permuted to maintain the hierarchy shown in the second column (synergistic > additive > antagonistic). The graphical rendition of drug interactions as slanted panel in the second column is hard to interpret: what do the authors want to highlight there?

We now clarify this in the figure legend. We accept the reviewer’s permutation suggestion and have revised this figure. The slanting in column 2 is to indicate that the slanted pattern is one of an entire deck of synergistic or antagonistic patterns classified by iSAIL.

Figure 2D: stimulat -> stimulate

This has been revised.

Figure 2E: Specify which IFN.

This has been revised.

Figure 5 legend: Include concentrations of IFN-β and TNF (these could be normalized to EC50 to make it easier to read and interpret -is the experiment done at saturation -ceiling-, at subdetectable levels -floor- or in linear regime; around EC50)

The concentrations have now been included.

Label the columns in panel c

This has been revised.

Reviewer #3:

This study utilizes machine learning to examine gene expression datasets of immune stimulation and attempts to derive a universally applicable set of canonical responses. The basic approach dates back to the work of Janes et al. (Science 2005), which, incidentally, is not cited. My main concern centers on the need for a relatively large amount of data combined with synthetic data in order to obtain insights focused on well-studied interactions of TNF with other cytokines. Furthermore, I am unclear as to the true universal applicability of this approach, for the following reasons:

When analyzing combinatorial treatments datasets, there are two major challenges. One is classifying and interpreting the presence of interactions. The second is predicting the occurrence of interactions from single treatment data. From this reviewer’s citing of the Janes et al. study, we suspect that these two different goals (classifying and interpreting vs predicting interactions) have not been clearly distinguished in introducing the problem setting, and this may have caused a lack of clarity regarding the main contributions of our work. The Janes study was directed at predicting apoptotic responses from detailed signaling assays. In contrast, our study addressed a completely different problem: systematically detecting, classifying and interpreting interaction effects in a dataset and integrating insight across many studies. In this context, we clarify both the objective of the framework and derived biological insight below.

Objective of the iSAIL method: iSAIL is not a “prediction system”, i.e. our objective is not to predict, denovo, the occurrence of interactions. The goal of our method is to robustly identify, classify and interpret interactions. This work is a crucial advance, a statistically rigorous identification of interaction effects has been lacking, which has limited the insight gained from combination treatment experiments using heretofore available tools. Furthermore, the method is not predicated on the availability of many datasets (as possibly implied by the reviewer). To clarify, we generate a synthetic dataset to train a machine learning classifier – the only scenario where the ground truth classification of interactions is known. Once the model parameters are optimized, any individual dataset constructed according to the design of a standard combinatorial treatment experiment, including control, single and combination treatment conditions, can be analyzed using iSAIL.

Biological insight: The reviewer’s main concern is that the insights gained by iSAIL involve a well-characterized system, such as TNF in combination with other cytokines. Our validation system was not meant to confirm the generic prediction that TNF synergizes with another cytokine (e.g. IFN-β), as is well known in the literature. Rather, the validation aimed to show our ability to correctly infer the biological effects and mechanisms associated with such interactions. The validation experiments confirmed the validity of our inferences of biological mechanisms predicated on VCAM1 and CCL5 which are, to the best of our knowledge, original and previously unpublished results.

From this perspective, the ability of iSAIL to identify new biological effects and corresponding mechanisms of interactions in a relatively well characterized system can be seen -different from the reviewer's interpretation- as a proof-of-concept that our method holds an even greater potential for discovery in less characterized systems of signal combinations.

Finally, while we do not aim to directly predict interactions in individual datasets, we show that our framework does provide new insight on general rules that govern combinatorial interactions. As we state in the Discussion, these general rules can be used in future work aiming to predict interactions, but this possibility is tangential to the main research objective of our work. We acknowledge that our results on the probabilistic algebra may have generated confusion, as that section is thematically close to predictive modelling. Thus, in this revised, resource-oriented version, we have removed this section from the Results.

1) The authors mention that the datasets used involved multiple time points, but it is unclear how dynamics are being handled and modeled across these datasets. One would fully expect that in any extensive time course study of two stimuli, multiple outcomes (e.g. potentiation followed by tolerance or stimulation followed by suppression) would be observed. How can the algorithm then decompose such phenomena into a single characteristic response (i.e., does the algorithm treat each time point as a separate experiment, thereby losing true time-dependent interrelationships?)

We agree with the reviewer that the dynamics of signal integration is a very relevant, yet largely unexplored area. Extensive search of public data repositories (datasets deposited before 2019) showed that immunological combinatorial datasets with time course sampling are rare, precluding systematic research in this direction. Included in our compendium are time-course datasets with only two time points. Due to such poor temporal sampling, these studies were analyzed as separate experiments. While independent analysis of each time point would not be optimal for longer time series, developing a method to model temporal dynamics in combinatorial treatment experiments is impractical because of paucity of data.

Although the dynamics of combinatorial responses is not addressed in our work, our new analysis framework still provides a fundamental grammar and set of tools to map gene trajectories from longer time series. Time-dependent interrelationships could be included in our analysis, for example, by imposing additional constraints that guarantee smooth transitions between interaction profiles at successive time points. Our analysis framework, applied to a large compendium of time course experiments, should they become available, will enable us to identify preferential trajectories and dynamic rules governing time transitions between interaction profiles, similar to the probabilistic algebra we have identified in the static case.

Although the dynamics is a key aspect of combinatorial treatments, due to a complete lack of appropriately structured datasets and to the need for further extensive development, we consider it beyond the current scope of our work. To acknowledge the relevance of the dynamics, we have added a new paragraph to the Discussion.

2) The system chosen for validation, namely TNF + IFN-β, does not seem particularly novel. It has been appreciated for decades that TNF synergizes with other cytokines to produce an array of effects, including potentiation and tolerance as examples of preconditioning. A demonstration of a truly unexpected set of interactions between stimuli would have been more convincing.

As clarified in the response to the reviewer (under General assessment and major comments section), Our validation system was not meant to confirm the generic prediction that TNF synergizes with other cytokines (e.g. IFN-β), as is well known in the literature, including in our publications. Rather, the validation aimed to show our ability to correctly infer the biological mechanisms of such interactions. For more on this point, please refer to our answer above.

3) The additional validation studies focused on VCAM1 also seem to be focused on relatively well-studied phenomena. Multiple other gene transcripts that appear more novel are not followed up.

Our validation studies reveal important biological effects and mechanisms mediated by synergistic interactions between TNF and IFN β. To the best of our knowledge, these results are original and unpublished. The choice of VCAM1 and CCL5 as validation targets was driven by a combination of data-driven discovery, as provided by iSAIL, and prior knowledge. In our opinion, using prior knowledge, such as previously published literature and well-annotated functional pathway libraries at the validation step does not diminish the originality of our results. For example, while VCAM1 is generically known as a mediator of cell adhesion, no study has previously shown in a data-driven way that this gene is synergistically induced by TNF and IFNb, and validated the hypothesis that it controls the DC / T cell crosstalk.

When applied to a combinatorial treatment experiment of interest, iSAIL returns a set of transcripts displaying the most relevant interaction effects. These transcripts typically include both well- and poorly- characterized genes, both of which can be further explored in validation studies. The choice of gene candidates for further experiments depends on the context and underlying research objectives, and is ultimately the prerogative of the experimenter. In our immunological model, we focused on the discovery of new mechanisms regulating the crosstalk between DC and T cells, which is vastly recognized as one the most critical steps in the generation of an immune response.

References:

Mack, Chris A. "How to write a good scientific paper: acronyms." Journal of Micro/Nanolithography, MEMS, and MOEMS 11, no. 4 (2012): 040102.

Senthivel, Vivek Raj, et al. "Identifying ultrasensitive HGF dose-response functions in a 3D mammalian system for synthetic morphogenesis." Scientific reports 6.1 (2016): 1-15.

https://doi.org/10.7554/eLife.62148.sa2

Article and author information

Author details

  1. Antonio Cappuccio

    1. Institut Curie, Integrative Biology of Human Dendritic Cells and T Cells Laboratory, PSL Research University, Inserm, U932, Paris, France
    2. Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, United States
    Contribution
    Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Visualization, Methodology, Writing - original draft, Project administration, Writing - review and editing
    Competing interests
    No competing interests declared
  2. Shane T Jensen

    Department of Statistics, Wharton School, University of Pennsylvania, Philadelphia, United States
    Contribution
    Formal analysis, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
  3. Boris M Hartmann

    Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, United States
    Contribution
    Conceptualization, Supervision, Funding acquisition, Validation, Methodology, Writing - original draft, Project administration, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-5649-6776
  4. Stuart C Sealfon

    Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, United States
    Contribution
    Conceptualization, Formal analysis, Supervision, Funding acquisition, Writing - original draft, Project administration, Writing - review and editing
    Competing interests
    No competing interests declared
  5. Vassili Soumelis

    1. Institut Curie, Integrative Biology of Human Dendritic Cells and T Cells Laboratory, PSL Research University, Inserm, U932, Paris, France
    2. Laboratoire d'immunologie, biologie et histocompatibilité, AP-HP, Hôpital Saint-Louis, Paris, France
    Contribution
    Conceptualization, Supervision, Funding acquisition, Writing - review and editing
    For correspondence
    vassili.soumelis@aphp.fr
    Competing interests
    No competing interests declared
  6. Elena Zaslavsky

    Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, United States
    Contribution
    Conceptualization, Formal analysis, Supervision, Funding acquisition, Validation, Methodology, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    elena.zaslavsky@mssm.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-4828-7771

Funding

National Institute of Allergy and Infectious Diseases (5U19AI117873)

  • Stuart C Sealfon

Defense Advanced Research Projects Agency (N6600119C4022)

  • Stuart C Sealfon

European Research Council (IT-DC 281987)

  • Vassili Soumelis

Agence Nationale de la Recherche (ANR-11-LABX-0043 CIC IGR-Curie 1428)

  • Vassili Soumelis

European Research Council (POC DrugSynergy 680890)

  • Vassili Soumelis

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was supported by the US National Institutes of Health (NIH) grant 5U19AI117873, by the Defense Advanced Research Projects Agency (DARPA) contract number N6600119C4022, by the European Research Council under Grant IT-DC 281987, by Agence Nationale de la Recherche under Grant ANR-11-LABX-0043 CIC IGR-Curie 1428, and by ERC 2015 POC DrugSynergy 680890.

Senior Editor

  1. Satyajit Rath, Indian Institute of Science Education and Research (IISER), India

Reviewing Editor

  1. Alfonso Valencia, Barcelona Supercomputing Center - BSC, Spain

Publication history

  1. Received: August 15, 2020
  2. Accepted: November 5, 2020
  3. Accepted Manuscript published: November 23, 2020 (version 1)
  4. Version of Record published: December 18, 2020 (version 2)

Copyright

© 2020, Cappuccio et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 878
    Page views
  • 140
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Computational and Systems Biology
    2. Epidemiology and Global Health
    Olivier Thomine et al.
    Research Article

    Simulating nationwide realistic individual movements with a detailed geographical structure can help optimize public health policies. However, existing tools have limited resolution or can only account for a limited number of agents. We introduce Epidemap, a new framework that can capture the daily movement of more than 60 million people in a country at a building-level resolution in a realistic and computationally efficient way. By applying it to the case of an infectious disease spreading in France, we uncover hitherto neglected effects, such as the emergence of two distinct peaks in the daily number of cases or the importance of local density in the timing of arrival of the epidemic. Finally, we show that the importance of super-spreading events strongly varies over time.

    1. Chromosomes and Gene Expression
    2. Computational and Systems Biology
    Zelin Liu et al.
    Tools and Resources

    Circular RNAs (circRNAs) act through multiple mechanisms via their sequence features to fine-tune gene expression networks. Due to overlapping sequences with linear cognates, identifying internal sequences of circRNAs remains a challenge, which hinders a comprehensive understanding of circRNA functions and mechanisms. Here, based on rolling circular reverse transcription (RCRT) and nanopore sequencing, we developed circFL-seq, a full-length circRNA sequencing method, to profile circRNA at the isoform level. With a customized computational pipeline to directly identify full-length sequences from rolling circular reads, we reconstructed 77,606 high-quality circRNAs from seven human cell lines and two human tissues. circFL-seq benefits from rolling circles and long-read sequencing, and the results showed more than tenfold enrichment of circRNA reads and advantages for both detection and quantification at the isoform level compared to those for short-read RNA sequencing. The concordance of the RT-qPCR and circFL-seq results for the identification of differential alternative splicing suggested wide application prospects for functional studies of internal variants in circRNAs. Moreover, the detection of fusion circRNAs at the omics scale may further expand the application of circFL-seq. Together, the accurate identification and quantification of full-length circRNAs make circFL-seq a potential tool for large-scale screening of functional circRNAs.