Deciphering the combinatorial landscape of immunity
Abstract
From cellular activation to drug combinations, immunological responses are shaped by the action of multiple stimuli. Synergistic and antagonistic interactions between stimuli play major roles in shaping immune processes. To understand combinatorial regulation, we present the immune Synergistic/Antagonistic Interaction Learner (iSAIL). iSAIL includes a machine learning classifier to map and interpret interactions, a curated compendium of immunological combination treatment datasets, and their global integration into a landscape of ~30,000 interactions. The landscape is mined to reveal combinatorial control of interleukins, checkpoints, and other immune modulators. The resource helps elucidate the modulation of a stimulus by interactions with other cofactors, showing that TNF has strikingly different effects depending on co-stimulators. We discover new functional synergies between TNF and IFNβ controlling dendritic cell-T cell crosstalk. Analysis of laboratory or public combination treatment studies with this user-friendly web-based resource will help resolve the complex role of interaction effects on immune processes.
Introduction
Immunological processes are regulated by a broad diversity of stimuli, such as growth factors, cytokines, integrins, and pathogen components. Studies of immune regulation have largely focused on individual stimuli, elucidating transcriptional and functional programs induced by major immune modulators such as TNF (Locksley et al., 2001; Kalliolias and Ivashkiv, 2016; Bouwmeester et al., 2004), interferons (Pestka, 2007; Schneider et al., 2014; González-Navajas et al., 2012), and Toll-like Receptor (TLR) ligands (Kawasaki and Kawai, 2014; Vidya et al., 2018). However, immunological responses are constantly shaped by the combined action of multiple stimuli. As a fundamental example, T helper (Th) cell activation requires a combination of antigen presentation and costimulatory molecules (Smith-Garvin et al., 2009). Due to combinatorial effects, the study of individual stimuli alone may provide a misleading oversimplification of their role in complex microenvironments. Developing a systematic understanding of how combinations of stimuli control the immune system is an unmet challenge, with broad biological and therapeutic implications.
What makes studying immunology from a combinatorial perspective essential is that nonlinear interactions often make the effects of combined stimuli dramatically different from merely additive effects seen with each stimulus alone. Immunological studies have discovered that interaction effects play major roles in shaping immune system processes (Noubade et al., 2014; Min et al., 2012; Tuvim et al., 2012; Walasek et al., 2012; Hartmann et al., 2014; Teles et al., 2013). However, the complexity of these effects and the lack of resolution of current combinatorial experiment analysis methods are a barrier to understanding how interactions regulate immunological functions. The standard approach to this problem is to identify genes showing positive (synergistic) or negative (antagonistic) deviations in the combined treatment condition from the sum of individual effects. This restricted binary classification has critical limitations. First, it does not distinguish between biologically relevant and artifactual interactions that depend on the specific assay (Cappuccio et al., 2015). Second, it does not allow for meaningful aggregation of interactions to delineate coherent functional programs. Finally, this standard approach does not support integration and shared interpretation of interactions from multiple combination treatments. These limitations obscure general properties and functions of interactions between immunological stimuli.
To overcome this barrier, we develop a framework based on combinatorial analysis and machine learning. Using combinatorial analysis, we derive a comprehensive taxonomy of all interaction patterns that can occur in a combination treatment experiment. Next, we train a machine learning classifier able to robustly map noisy -omics data across the pre-defined taxonomy. As we demonstrate, this framework now resolves the most informative synergistic and antagonistic effects. Annotation of genes assigned to each discrete regulatory pattern identifies the regulated immunological functions. In addition, assignment of responses from multiple combinatorial datasets to a standardized taxonomy enables global integration to systematically elucidate the interactions that shape immune functions.
The resulting immune Synergistic and Antagonistic Interaction Learner (iSAIL) resource includes the machine learning classifier, a curated compendium of immunological combination treatment datasets and a global dataset integration to produce a landscape of ~30,000 interactions from diverse immune cell types and combinations of stimuli. We show how the landscape can be mined to reveal new insight into the prevalence of different types of interactions, and into their immunological functions. Further, we show that the resource facilitates the discovery of new combinatorial regulatory processes. iSAIL analysis of data on TNF and IFNβ co-treatment led us to discover new synergistic mechanisms implicated in dendritic cell-T cell crosstalk.
iSAIL is available as a user-friendly web-accessible resource to help elucidate the combinatorial complexity of immunity.
Results
Background and motivation
The prototypical combination treatment experiment comprises -omics data obtained from the following conditions: a vehicle control (denoted by 0), two individual stimuli (X, Y), and their combination (X+Y), Figure 1A, left subpanel. The goal of this experiment is to discover cellular processes regulated by positive (synergistic) and negative (antagonistic) interactions. This requires a fine-grained analysis able to distinguish between biologically relevant and artifactual interactions, and able to aggregate similar interactions to delineate coherent functional programs. The standard approach, which classifies interactions as positive or negative, does not do this adequately, leading to a loss of important biological information. For example, biologically meaningful positive interactions such as emergent responses (e.g. an increase or decrease with the combination in the absence of any effect in the individual treatment conditions), would not be distinguished from nominal interactions with low biological significance related to saturation effects. Furthermore, the aggregation of all the positive (or negative) interactions obscures the inference of coherent biological processes as typically obtained through pathway enrichment analysis, because this crude aggregation confounds very diverse interaction effects.
Machine learning framework to map interaction effects from -omics data from combination treatment studies
To improve the extraction of biological insight from a combination experiment, we develop a machine learning framework within the immune Synergistic and Antagonistic Interaction Learner (iSAIL) resource. The framework increases the resolution of a standard approach by using an abstract, pre-defined taxonomy of 123 distinguishable response profiles that can theoretically occur in a combination treatment experiment (Figure 1—figure supplement 1). These profiles represent qualitatively different scenarios for the expression of a gene in a combination treatment experiment. To analyze data on combination treatment, iSAIL applies a machine learning classifier trained to robustly classify the experimental gene response across the taxonomy (Figure 1A, middle subpanel). Interaction classification makes it possible to resolve the most meaningful interactions (e.g. emergent responses), while automatically excluding from downstream analysis all the nominal interactions (e.g. saturation effects) that carry minimal biological information. Importantly, genes classified in the same interaction profile are likely to share a coherent functional program. These coherent gene groups are then interrogated for mechanistic insight via pathway enrichment analysis (Figure 1A, right subpanel). By breaking down the combinatorial effects into these functional units, we greatly improve the functional interpretation of biological processes regulated by interactions.
A key problem we address is to robustly classify gene response from noisy -omics data across. To solve this problem, the classifier generates a probabilistic assignment of the response for each gene to the taxonomy-defined profiles. This solution is more robust to noise than the deterministic matching previously used for related problems (Cappuccio et al., 2015). We systematically validate our approach by evaluating the classification performance on simulated data. Furthermore, in order to select the optimal framework, we investigate multiple machine learning models and compare their probabilistic output using the synthetic data under different noise regimes. These analyses identify Random Forest as a robust method with high predictive accuracy across all noise regimes (Figure 1—figure supplements 2 and 3).
A resource to analyze user-generated or public immunological combination treatment datasets
The taxonomy of interaction profiles used by the learning framework provides a standardized abstract space, independent of any specific dataset, where interactions from multiple datasets can be mapped and integrated for global analysis. Based on immunological relevance and data quality (see Materials and methods), we select a compendium of 25 human and 7 murine combination treatment studies for inclusion in the resource (Table 1). These datasets include studies of innate immune cells including several dendritic cell populations, macrophages, and monocytes, as well as epithelial cells and adaptive immune cells. Stimuli include diverse combinations of cytokines, Toll-Like Receptor (TLR) ligands, and drugs. We apply our classification framework to each dataset from the compendium and agglomerate the results in an overall interaction landscape, including it as a component of the resource (Figure 1B).
The implemented interaction classification methodology, along with the interaction landscape, is made available in a user-friendly platform. The platform allows researchers to upload and analyze interactions from newly generated data, or to mine the interaction landscape, derived from our compendium, for global insight. To facilitate the functional interpretation of interactions, the platform is integrated with external resources including ImmPort (Bhattacharya et al., 2014), Gene Cards (Stelzer et al., 2016), and enrichR (Kuleshov et al., 2016), that provide extensive single-gene and pathway level annotations. The iSAIL resource has the capacity to generate insight into combinatorial immunity and help guide hypothesis generation and further experimentation.
Exploring and visualizing the interaction landscape
To build the interaction landscape, we systematically applied our interaction classification framework (Figure 2A) to the data compendium, mapping a total of 29,479 interactions. The number of interactions vary widely with the different cell types and combinations of stimuli (Figure 2—figure supplement 1). The landscape, extensively annotated using external databases, is an information-rich object that can be mined to gain new insight on general properties of interactions and on the immunological functions they regulate.
For example, mining the landscape enables a systematic quantification of the prevalence of interactions that may occur with combination treatments, which is currently unknown. We find that the most frequent interactions likely represent range limitations of the assay or biological responses, which we refer to as floor and ceiling effects (Figure 2B, top subpanel). These effects provide limited biological information in the absence of dose-response curves (Chou, 2006), unavailable in our context. Our approach segregates floor and ceiling effects from less frequent but more biologically relevant interactions, including suppression (9%), inhibition (8%), restoration (4%), emergence (4%), and potentiation (3%) (Figure 2B, middle subpanel). Our analysis also reveals that several theoretically possible combinatorial effects were nearly absent (<0.04%). These rare profiles include reversals, where two signals with the same individual effect (e.g. upregulation of a gene separately by X and Y) are reversed by the combination (e.g. downregulation of the same gene by X+Y) (Figure 2B, bottom subpanel).
To explore the immunological functions regulated by interactions, we visualize the landscape using an interactive 3D representation. The three axes represent datasets, genes, and scores quantifying the intensity and robustness of the identified interactions (Figure 2C). Each axis is then annotated with additional information. The dataset axis is annotated with metadata on the experiments including species, cell type, stimuli, and time point. The gene axis is annotated with immunological gene families such as chemokines, interleukins, checkpoints and other terms from the ImmPort database (Bhattacharya et al., 2014). The interaction axis is annotated by the taxonomy profiles assigned to each interaction by the machine learning classifier. Slicing the landscape along specific dimensions provides different types of insight into immunological interactions.
For example, slicing by gene family allows systematic identification of positive and negative interactions involving immune modulators of interest. Figure 2D shows a 2D projection of the landscape involving stimulatory and inhibitory checkpoints for a selection of datasets including macrophage, moDC, pDC, and T cells. Genes in these families show a variable propensity toward synergistic and antagonistic regulation across datasets. While CD40 and CD80 presented sparse, selective interaction effects across datasets, IDO1 shows positive interactions in 5 out of the 15 selected datasets. The profiles of these synergies contain two cases of potentiation from different experimental models, one involving pDC stimulated with IL3 and influenza virus (Flu), and the other involving moDC stimulated with TNF and IFNβ (Figure 2E). Because IDO1 inhibits T cell division and promotes regulatory T cells (van Baren and Van den Eynde, 2015), these positive interactions may serve to restrict overreacting immune responses in the inflammatory microenvironment.
Our results show that mining the interaction landscape can reveal new insight on combinatorial interactions and their immunological functions.
Interactions determine a context-dependent TNF biology
To further illustrate how the interaction landscape can be analyzed, we study how TNF biology can be altered by the presence of cofactors. To this end, we sliced the combinatorial landscape along two axes (Figure 3, top-left). From the dataset axis, we selected combination treatments involving TNF with other stimuli, including IFNγ and IFNβ in four human cell models (Figure 3, left-margin). From the interaction axis, we extracted interaction profiles that encoded a qualitative change of the TNF effect when considered as a mono-treatment. We started by considering three types of qualitative changes: suppression, antagonistic reversal, and synergistic reversal of TNF effects (Figure 3, top-margin).
For each pair of dataset and profile, we processed the corresponding gene list with enrichment analysis to gain insight at the functional level. Genes showing suppression and reversal of TNF effects were significantly enriched in important immune processes including Th1 polarization and antigen presentation (Figure 3). Further analysis (Figure 3, top-right, Materials and methods) also suggested that co-modulators can drive the emergence of new functions, not observed by TNF stimulation in isolation. Although these emerging functions were relatively few, they comprised key processes such as proteasome degradation, an alternative NF-kB activation pathway, and T cell chemotaxis.
Our results suggest that TNF biology can be qualitatively altered by other stimuli through a variety of interaction effects including suppression, reversal, and the emergence of entirely new functions. This provides a new insight into the function of a cytokine.
Prediction and validation of TNF and IFNβ interaction effects in human monocyte-derived dendritic cells
Finally, we investigate in-depth the interactions in our newly generated dataset of TNF combined with the cytokine IFNβ (see Materials and methods), and illustrate the hypothesis generation and validation cycle enabled by iSAIL. Although TNF and IFNβ are key modulators of immune function whose individual effects have been extensively studied, their interactions remain poorly understood. We previously reported that TNF and IFNβ act synergistically to induce an antiviral state in monocyte-derived dendritic cells (moDC) (Hartmann et al., 2014). To study the systems level impact of TNF and IFNβ co-treatment on human moDC, we applied iSAIL to this transcriptomic combination treatment experiment.
iSAIL detected 374 positive interactions, which we mapped to the corresponding profile groups. The interaction groups with the largest number of positive interactions were ‘emergent synergy’, ‘TNF potentiates IFNβ’, ‘IFNβ restores TNF’, and ‘IFNβ potentiates TNF’ (Figure 4A). To understand the function of these interaction profiles, we sorted the corresponding gene lists based on the synergy score and searched for candidates with potential key immunological roles.
Focusing on emergent synergies, we found the largest synergy scores for LIMK2, MCOLN2, SLC7A5, TP53BP2, and several genes having more established immunological roles such as RELB, IL15RA, and VCAM1. The protein VCAM-1 has been described as a regulator of leukocyte migration and cell adhesion (Deem et al., 2007). Due to the fundamental importance of the DC-T cell axis in the generation of an immune response, we hypothesized that TNF and IFNβ synergistically induce VCAM-1 to promote moDC-T cell adhesion. We experimentally tested this hypothesis by quantifying DC-T cell adhesion using imaging flow cytometry (see Materials and methods). When exposed to the combination of TNF and IFNβ, moDC showed an increased adhesion to T cells that was not observed with either cytokine alone (Figure 4B). The increased DC-T cell adhesion was mediated by VCAM-1, since VCAM-1 neutralization abolished the synergistic effect (Figure 4B). To our knowledge, these results identify for the first time a role for synergistic induction of VCAM-1 by TNF and IFNβ in promoting DC-T cell adhesion.
To further explore the immune processes controlled by TNF and IFNβ synergies, we performed enrichment analysis (Figure 4C, left) separately for the different profiles, and compared the results with a conventional analysis that aggregates all the synergies in a single gene set (Figure 4C, right). Annotation by profile increased the number and significance of annotation terms compared to the conventional analysis. Interestingly, we found a reciprocal potentiation of TNF and IFNβ signaling pathways by the combination treatment. Both iSAIL and conventional analysis captured a highly significant enrichment in mineral absorption. However, iSAIL analysis also revealed the pattern ‘IFNβ restores TNF’ as the main contributor to this enrichment. This pattern contains several members of the family of metallothioneins (MT1X, MT1E, MT1F, MT1HL1), which are increasingly recognized as important players in the response to cytokines and pathogen signals (Subramanian Vignesh and Deepe Jr., 2017).
Some annotation terms appeared only in the iSAIL analysis and were not resolved by conventional analysis. In particular, we found the pattern ‘TNF potentiates IFNβ’ enriched in NK cell proliferation, and the pattern ‘IFNβ potentiates TNF’ enriched in T cell proliferation. We studied allogeneic cross-donor stimulation to test the hypothesis suggested by this analysis that a synergistic effect on T cell proliferation resulted from IFNβ and TNF stimulation (Figure 4D, left panel). We observed a nearly twofold increase in the percent of proliferating T cells upon the combination treatment, an effect not seen with either stimulus alone. The synergy pattern of the T cell proliferation measurement diverged slightly from the gene level profile, which showed some effect by TNF alone. It is not surprising that the interactions profiles comparing mRNA level regulation and protein-dependent functional effects show marginal differences. Importantly, the synergistic induction of T cell proliferation, predicted by iSAIL, was validated experimentally.
Next, we wanted to identify the molecular mediators of the increased T cell proliferation. Candidate genes for mediating this synergistic response predicted by enrichment analysis included TNSFS9 and CCL5. Review of the literature suggested CCL-5 as the most likely candidate (Makino et al., 2002). We therefore hypothesized that CCL-5 may contribute to increased proliferation of T cells induced by TNF and INFβ exposed DC. This hypothesis was confirmed by immunoneutralization of CCL-5 (Figure 4D, right panel). These validation experiments demonstrate the value of iSAIL in mining interaction data for new hypotheses leading to biological and mechanistic insight.
Discussion
In this work, we present a comprehensive resource to map and interpret interaction effects from individual and multiple combination treatment studies. Applying our machine learning classification method to the compendium of immunological combination treatment datasets, we identify ~30,000 interactions. We show that the collection of interactions visualized as a landscape can be mined for general insight into the microenvironment-dependent role of key immunomodulators, for specific testable hypotheses about combinatorial control of immune processes.
A potential consequence of interactions is the radical modification of effects observed with individual treatments. Using the iSAIL framework, these events are easily identified by isolating interaction profiles that encode qualitative combination changes in the effect of a treatment of interest. In the case of TNF, we found that co-modulators alter fundamental immunological processes, such as antigen presentation and T helper cell polarization, and may produce the emergence of entirely new functions, not modulated by TNF exposure alone. Identifying the context-dependent effects of an agent may be useful from a therapeutic perspective. Our approach may assist in the design of drug combinations that leverage antagonistic interactions to selectively reduce pathogenic activity while preserving necessary homeostatic effects.
A fundamental tool for biological interpretation of -omics experiments is pathway-level enrichment analysis. iSAIL uncovers biological processes regulated in combination treatment studies by fine-grained aggregation of genes showing similar interaction responses. These aggregates can reflect coherent biological processes, providing insight into the principles and functions of interactions. In the analysis of TNF and IFNβ co-treatment, iSAIL uncovered novel synergies that control the DC-T cell interactions and T cell proliferation, both of which are critical immune processes. Importantly, iSAIL analysis generated hypotheses on the molecular mediators controlling these functions, VCAM-1 and CCL-5, which were validated by additional experiments. The ability to identify new interactions and mechanisms in a relatively well characterized combination such as TNF and IFNβ points to an even greater potential for discovery in less characterized combinations.
In its current version, iSAIL was developed to analyze transcriptomics experiments. Consistent with this, the classifier was trained under the assumption of normally distributed data, which is commonly held in the analysis of microarray and RNA-seq data (upon a suitable transformation). However, under alternative distributional assumptions, our framework can be adapted to any -omics data from combination treatment experiments, including proteomics, metabolomics, and epigenomics.
Understanding the dynamic properties of combinatorial interactions is a very relevant, yet largely unexplored area. Included in our compendium are time-course datasets with only two time points. Due to such poor temporal sampling, these studies were analyzed as separate experiments. Although the dynamics of combinatorial responses is not addressed in our work, the iSAIL framework would still provide a fundamental grammar to map gene trajectories from longer time series. Time-dependent interrelationships could be included in our analysis, for example, by imposing additional constraints that guarantee smooth transitions between interaction profiles at successive time points. Our analysis framework, applied to a large compendium of time course experiments, should they become available, will enable the identification of preferential trajectories and dynamic rules governing time transitions between interaction profiles.
Despite the centrality of combinatorial interactions to the control of the immune system, many important questions remain open. In particular, the ability to predict combinatorial interaction effects based on the individual effects remains elusive (Preuer et al., 2018; Ryall and Tan, 2015; Menden et al., 2019). Although predicting interactions is unrelated to the objectives of our study, iSAIL may still support research in this direction. For example, our global estimates of the frequency of different interactions may be used as prior information to enhance the predictive power of future models.
Finally, insight into combination treatment experiments is relevant to the field of drug therapeutics, as thousands of clinical trials in the United States alone are studying the effects of drug combinations (Schmidt, 2017; Nature Medicine, 2017; Wu et al., 2015). Drug interactions can drive both clinical therapeutic efficacy and treatment side effects. The success of drug combinations relies on the control of pathogenic pathways while preserving the homeostatic pathways. By uncovering relevant interactions and their functions, iSAIL user-friendly online tools can further the understanding and therapeutic applications of interaction mechanisms.
Materials and methods
Definition and simulation of interaction profiles
Request a detailed protocolThe notion of interaction profile was introduced in our previous work (Cappuccio et al., 2015) and is briefly summarized here. Figure 1—figure supplement 1 shows the taxonomy of 123 profiles used in this study. Mathematically, each of these profiles corresponds to a linear system of inequalities satisfied by the mean expression levels of a gene in the conditions 0, X, Y, X+Y. These mean expression levels are respectively denoted by , , , . The system that defines a given profile admits infinitely many solutions, each of which can be seen as a particular instance of the profile. For example, an emergent synergy (Figure 1—figure supplement 1, profile 3) is satisfied by the vectors (2.2, 2.2, 2.2, 5.5), (4.1, 4.1, 4.1, 7.8), as well as by infinitely many other qualitatively similar vectors.
To simulate a profile, our strategy starts by sampling the solution space of the corresponding system of inequalities. The solution space is sampled within a range of admissible values, calibrated to mimic the typical numerical ranges of -omics data. Next, a noise term is added to each instance of a profile using a random number generator. To account for the relatively small number of samples in -omics data, we simulated four replicates for each of the conditions 0, X, Y, X+Y. The noise term is assumed to be normally distributed. This assumption is widely held in the analysis of microarray data, and still applicable to RNA-seq data upon a suitable transformation (Law et al., 2014). Increasing noise levels correspond to a decreasing effect size, which is defined in terms of the standardized differences between the group means in the four conditions, as further described below.
The steps to simulate interaction profiles are as follows:
Definition of a range of admissible expression values. This was chosen as the interval [−14, 14], consistent with the log2-transformed expression values from microarray and RNA-seq upon a Voom transformation (Law et al., 2014).
Sampling the solution space for the given profile in the specified range. The vector of numbers (, , , ) for the given profile were found with the function xsample from the package limsolve. For each profile, we extracted 400 instances for each level of noise.
Definition of the signal of a simulated profile. The signal can be seen as a generalization of the fold-change in A vs. B experiments. In a combination treatment, we first consider all pairwise fold-changes from the conditions 0, X, Y, X+Y. Except for the case of constant genes, at least one of these contrasts must be different from 0. The signal δ is defined as the absolute value of the smallest non-zero differences. To avoid very weak signals, not meaningful in the analysis of expression data, a minimum signal of 0.5 is used in the simulated data.
Simulation of random noise. To simulate random variability around the values , , , , the data was assumed to be normally distributed around the group means: , with i = 0, X, Y, X+Y. The parameter was assumed the same for all groups. For each of the four conditions, we simulated n=4 replicates. Different levels of noise, were simulated by setting different values of the ratio .
Enforcement of the range of the expression values. The addition of random noise can push some of the values outside the initially prescribed range of expression. In this case, we forced the simulated values to be at the limit of the range. For example, a value of -18.5 was reset to -14, the lower limit of the prescribed range.
Training and testing of the machine learning classifier
Request a detailed protocolTo train the machine learning classifiers, we generated a training set by simulating multiple instances of every profile in the admissible range of expression values. Each instance of a simulated profile was represented as a vector of statistical features. These include the estimated mean values , , , , the p-values of all possible pairwise contrasts among these four values, and additional statistics returned by the Limma package (Ritchie et al., 2015) served as predictors of the true class.
The training set comprised different noise regimes. These were simulated by fixing different values for the parameter , as described in the previous section. We considered the following values: low noise ( = 4), medium noise ( = 2.5), high noise ( = 2). Training of the machine learning classifiers was done using the R packages Caret and RandomForest.
To select the best model, we generated additional simulated data and measured the out-of-sample classification accuracy per profile and for the same values of that were used to build the training set. For each of these values, the accuracy was quantified as the proportion of correct predictions. An advantage of RF and LDA over the deterministic match is the possibility for a ‘soft’ (i.e. probabilistic) classification of an input profile into any element of the taxonomy. The quality of the full probabilistic output of RF and LDA was quantified for all noise levels using the multi-class log gain (Figure 1—figure supplement 2D). To further assess the performance of RF and LDA for the different classes, the distribution of precision and recall over all taxonomy classes was examined (Figure 1—figure supplement 2). All these criteria indicated RF as the most robust classifier. The multiclass log-gain, and the class-specific precision and recall were computed with the R packages MLmetrics and mltest.
Generating the combinatorial landscape of immunity
Request a detailed protocolPublic combination treatment datasets were retrieved from Gene Expression Omnibus using the package GEOquery (Davis and Meltzer, 2007). The relevant datasets were identified using key terms typically associated with combination treatments such as 'synergy', 'antagonism', 'combinatorial', and similar. This approach was designed to facilitate automatic update of the resource as new -omics data from combination treatment experiments become publicly available. To facilitate comparisons, all the datasets were imported in the format of the original publications. The datasets were preprocessed as follows. First, if different probes were available for the same gene, the probe with the largest coefficient of variation was selected. Second, genes with low coefficient of variation (lower than the median across all genes) were filtered out. Next, differentially expressed genes were determined with the Limma package. A significance cutoff of 0.05 was applied on the p-values corrected for multiple testing. An additional cutoff was imposed on the δ (defined above): genes with δ lower than the median value computed across all differentially expressed genes were filtered out. The resulting differentially expressed genes were then analyzed by the machine learning classifier which assigned to each gene the predicted element of the taxonomy. For each identified interaction, a score was defined to measure its magnitude as well as its significance. The magnitude of the interaction was measured as the average Bliss index, defined as the average deviation from additivity , where (i = X, Y, X+Y), for positive interactions and for negative interactions. The significance of the interaction was measured as the class probability returned by the classifier. To account for both the significance and the magnitude of an interaction, an overall score was defined as the product
The identified interactions were annotated using a manually curated list of stimulatory and inhibitory immune checkpoints, as well as the gene lists provided by the ImmPort database (Bhattacharya et al., 2014).
Enrichment analysis
Request a detailed protocolThe functional enrichment of interactions was done using the Enrichr library (Kuleshov et al., 2016). Four annotation databases were considered: GO Biological Processes (2017b), KEGG (2016), Wikipathways (2016), Reactome (2016). Enrichment was considered significant if the enrichment p-value adjusted for multiple testing was lower than 0.05.
To analyze the synergies induced by IFNβ and TNF co-treatment, we focused on annotation terms with size lower than 500 genes, to increase the specificity of the identified functions and pathways. Moreover, we imposed a minimum threshold on the overlap between the annotation term and the gene set being analyzed. This threshold was meant to identify annotation terms covering a minimum proportion of the gene set being analyzed. We chose a minimum coverage of 2%.
DC differentiation
Request a detailed protocolAll human subjects research protocols were reviewed and approved by the IRB of the Icahn School of Medicine at Mount Sinai. Monocyte-derived DCs were obtained from healthy human blood donors following a standard protocol described elsewhere (Hartmann et al., 2017). All experiments were replicated using cells obtained from different donors.
Microarray data of human moDC treated with TNF and IFNβ
Request a detailed protocolDC were treated with 4500 pg/mL TNF, 3000 pg/mL IFNβ, or the combination of both for either 1 hr or 2.5 hr. Untreated DC served as a negative control. Three samples were taken per treatment and time point. RNA was extracted with the RNeasy plus kit (Qiagen) following the manufacturer’s instructions. Gene expression was assayed using broad human genome specific HG-U133_Plus_2 GeneChip expression probe arrays (Affymetrix). Affymetrix microarray data were normalized using gcRMA (Wu and Irizarry, 2004). Additional data processing was done as described above (see in Materials and methods section ‘Generating the combinatorial landscape of immunity’).
Experimental validation of TNF and IFNβ synergistic effects
Request a detailed protocolTo test the involvement of VCAM-1 in mediating TNF and IFNβ induced synergy, DCs were exposed to TNF, IFNβ, the combination of TNF and IFNβ or control as described above. Four hours after treatment DCs were exposed to allogeneic T cells in a 1:3 ratio for an additional 4 hr and then fixed with paraformaldehyde. Cells were stained with fluorochrome labeled antibodies against CD11c (DCs) and CD3 (T cells) and analyzed by imaging flow cytometry. DCs interacting with T cells were identified in images where one or multiple T cells had a direct contact with a DC.
To test the involvement of CCL-5 on TNF and IFNβ induced synergy, DCs were exposed to the cytokine mixtures as described above. After 4 hr, DCs were exposed to CFSE stained allogeneic T cells for 5 days and then fixed with paraformaldehyde. Cells were stained with a monoclonal antibody against CD11c and the extent of T cell proliferation was measured by the dilution of CFSE in the CD11c-negative population, as CFSE gets weaker with every T cell division. The stain to exclude DCs was necessary as DCs also digest CFSE-positive parts of T cells.
Data availability
Gene expression data generated for this study have been deposited in GEO under accession code GSE134209. All other dataset accession codes analysed in this study are included in the manuscript (Table 1). All the analyses have been implemented in R. An interactive R Shiny application of iSAIL can be found at https://isail.shinyapps.io/test_app/. The site also contains downloadable code and documentation to run the software locally.
-
NCBI Gene Expression OmnibusID GSE134209. Microarray data of human moDC treated with IFNβ and TNFα.
References
-
ImmPort: disseminating data to the public for the future of immunologyImmunologic Research 58:234–239.https://doi.org/10.1007/s12026-014-8516-1
-
Combinatorial code governing cellular responses to complex stimuliNature Communications 6:6847.https://doi.org/10.1038/ncomms7847
-
VCAM-1 activation of endothelial cell protein tyrosine phosphatase 1BThe Journal of Immunology 178:3865–3873.https://doi.org/10.4049/jimmunol.178.6.3865
-
Immunomodulatory functions of type I interferonsNature Reviews Immunology 12:125–135.https://doi.org/10.1038/nri3133
-
TNF biology, pathogenic mechanisms and emerging therapeutic strategiesNature Reviews Rheumatology 12:49–62.https://doi.org/10.1038/nrrheum.2015.169
-
Toll-like receptor signaling pathwaysFrontiers in Immunology 5:461.https://doi.org/10.3389/fimmu.2014.00461
-
Enrichr: a comprehensive gene set enrichment analysis web server 2016 updateNucleic Acids Research 44:W90–W97.https://doi.org/10.1093/nar/gkw377
-
Impaired T cell function in RANTES-deficient miceClinical Immunology 102:302–309.https://doi.org/10.1006/clim.2001.5178
-
Synergism between curdlan and GM-CSF confers a strong inflammatory signature to dendritic cellsThe Journal of Immunology 188:1789–1798.https://doi.org/10.4049/jimmunol.1101755
-
The interferons: 50 years after their discovery, there is much more to learnJournal of Biological Chemistry 282:20047–20051.https://doi.org/10.1074/jbc.R700004200
-
Interferon-stimulated genes: a complex web of host defensesAnnual Review of Immunology 32:513–545.https://doi.org/10.1146/annurev-immunol-032713-120231
-
T cell activationAnnual Review of Immunology 27:591–619.https://doi.org/10.1146/annurev.immunol.021908.132706
-
The GeneCards suite: from gene data mining to disease genome sequence analysesCurrent Protocols in Bioinformatics 54:5.https://doi.org/10.1002/cpbi.5
-
Metallothioneins: emerging modulators in immunity and infectionInternational Journal of Molecular Sciences 18:2197.https://doi.org/10.3390/ijms18102197
-
Tryptophan-degrading enzymes in tumoral immune resistanceFrontiers in Immunology 6:34.https://doi.org/10.3389/fimmu.2015.00034
-
Toll-like receptors: significance, ligands, signaling pathways, and functions in mammalsInternational Reviews of Immunology 37:20–36.https://doi.org/10.1080/08830185.2017.1380200
-
ConferenceCharacteristics of drug combination therapy in oncology by analyzing clinical trial data on ClinicalTrials.govPacific Symposium on Biocomputing.
-
Preprocessing of oligonucleotide array dataNature Biotechnology 22:656–658.https://doi.org/10.1038/nbt0604-656b
Article and author information
Author details
Funding
National Institute of Allergy and Infectious Diseases (5U19AI117873)
- Stuart C Sealfon
Defense Advanced Research Projects Agency (N6600119C4022)
- Stuart C Sealfon
European Research Council (IT-DC 281987)
- Vassili Soumelis
Agence Nationale de la Recherche (ANR-11-LABX-0043 CIC IGR-Curie 1428)
- Vassili Soumelis
European Research Council (POC DrugSynergy 680890)
- Vassili Soumelis
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
This work was supported by the US National Institutes of Health (NIH) grant 5U19AI117873, by the Defense Advanced Research Projects Agency (DARPA) contract number N6600119C4022, by the European Research Council under Grant IT-DC 281987, by Agence Nationale de la Recherche under Grant ANR-11-LABX-0043 CIC IGR-Curie 1428, and by ERC 2015 POC DrugSynergy 680890.
Copyright
© 2020, Cappuccio et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
-
- 1,472
- views
-
- 191
- downloads
-
- 8
- citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading
-
- Computational and Systems Biology
- Microbiology and Infectious Disease
Timely and effective use of antimicrobial drugs can improve patient outcomes, as well as help safeguard against resistance development. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is currently routinely used in clinical diagnostics for rapid species identification. Mining additional data from said spectra in the form of antimicrobial resistance (AMR) profiles is, therefore, highly promising. Such AMR profiles could serve as a drop-in solution for drastically improving treatment efficiency, effectiveness, and costs. This study endeavors to develop the first machine learning models capable of predicting AMR profiles for the whole repertoire of species and drugs encountered in clinical microbiology. The resulting models can be interpreted as drug recommender systems for infectious diseases. We find that our dual-branch method delivers considerably higher performance compared to previous approaches. In addition, experiments show that the models can be efficiently fine-tuned to data from other clinical laboratories. MALDI-TOF-based AMR recommender systems can, hence, greatly extend the value of MALDI-TOF MS for clinical diagnostics. All code supporting this study is distributed on PyPI and is packaged at https://github.com/gdewael/maldi-nn.
-
- Computational and Systems Biology
- Genetics and Genomics
Enhancers and promoters are classically considered to be bound by a small set of transcription factors (TFs) in a sequence-specific manner. This assumption has come under increasing skepticism as the datasets of ChIP-seq assays of TFs have expanded. In particular, high-occupancy target (HOT) loci attract hundreds of TFs with often no detectable correlation between ChIP-seq peaks and DNA-binding motif presence. Here, we used a set of 1003 TF ChIP-seq datasets (HepG2, K562, H1) to analyze the patterns of ChIP-seq peak co-occurrence in combination with functional genomics datasets. We identified 43,891 HOT loci forming at the promoter (53%) and enhancer (47%) regions. HOT promoters regulate housekeeping genes, whereas HOT enhancers are involved in tissue-specific process regulation. HOT loci form the foundation of human super-enhancers and evolve under strong negative selection, with some of these loci being located in ultraconserved regions. Sequence-based classification analysis of HOT loci suggested that their formation is driven by the sequence features, and the density of mapped ChIP-seq peaks across TF-bound loci correlates with sequence features and the expression level of flanking genes. Based on the affinities to bind to promoters and enhancers we detected five distinct clusters of TFs that form the core of the HOT loci. We report an abundance of HOT loci in the human genome and a commitment of 51% of all TF ChIP-seq binding events to HOT locus formation thus challenging the classical model of enhancer activity and propose a model of HOT locus formation based on the existence of large transcriptional condensates.