Human diseases arise in a complex ecosystem composed of disease mechanisms and the whole-body state. However, the precise nature of the whole-body state and its relations with disease remain obscure. Here we map similarities among clinical parameters in normal physiological settings, including a large collection of metabolic, hemodynamic, and immune parameters, and then use the mapping to dissect phenotypic states. We find that the whole-body state is faithfully represented by a quantitative two-dimensional model. One component of the whole-body state represents ‘metabolic syndrome’ (MetS) – a conventional way to determine the cardiometabolic state. The second component is decoupled from the classical MetS, suggesting a novel ‘non-classical MetS’ that is characterized by dozens of parameters, including dysregulated lipoprotein parameters (e.g. low free cholesterol in small high-density lipoproteins) and attenuated cytokine responses of immune cells to ex vivo stimulations. Both components are associated with disease, but differ in their particular associations, thus opening new avenues for improved personalized diagnosis and treatment. These results provide a practical paradigm to describe whole-body states and to dissect complex disease within the ecosystem of the human body.
In humans, there is a large variability in physiological and molecular parameters, which make up specific baseline phenotypic states. These states differ substantially between individuals, whether measured in healthy individuals or in patients. Even individuals within a well-defined clinical subgroup, such as a group of obese individuals or a group of patients with the same complex disease, demonstrate substantial inter-individual variation in their metabolic, immunological, and physiological phenotypes. Such differences in states often have important clinical implications. For example, specific phenotypic states such as high serum lipids, increased waist circumference, and high blood pressure are associated with a variety of complex diseases and with the severity of disease (Mørkedal et al., 2011; Cameron and Zimmet, 2008). Heterogeneous phenotypic states can be a barrier to successful therapeutic interventions, since treatments are usually the same for a certain disease (‘one-size-fits-all’ treatments) (Mach et al., 2018; Ridker et al., 2012). Gaining an understanding of phenotypic diversity within relatively homogeneous populations of individuals, and developing frameworks to reliably describe the different phenotypic states, is therefore of fundamental interest and could lead to development of more effective personalized approaches to diagnosis and treatment.
It is generally believed that each individual has a ‘whole-body state’ in which immune, metabolic, and hemodynamic parameters are coordinately regulated (Brodin and Davis, 2017; Price et al., 2017; Trachana et al., 2018). Current attempts to describe these states typically rely on comparisons between healthy and disease groups (Bouhaddou et al., 2019; Hwang et al., 2019; Huang, 2009). They were thus of limited usefulness to describe the overall phenotypic state in any given individual. For instance, a conventional way to describe the whole-body state in the context of cardiometabolic disease is ‘metabolic syndrome’ (MetS), which is defined as a small cluster of metabolic and hemodynamic factors that are associated with cardiovascular disease (Huang, 2009); as the concept of MetS is focused on cardiometabolic disease, it is not necessarily relevant to other types of disease. Consequently, to date, little is known about the most basic aspects of the whole-body state: its nature and structure in healthy humans, its variation between individuals, and its association with disease have not been well characterized.
Here we provide a framework to describe the whole-body state in individuals, characterize the molecular and functional properties of the whole-body state, and describe the potential applications of the state as a marker of disease. We start with the construction of a comprehensive map of immunological, hemodynamic, and metabolic parameters based on the natural diversity in healthy subjects. The resulting map of parameters (in short, a ‘map’) is consistent and reproducible in two healthy human cohorts of different age and body mass index (BMI) ranges, and in separate analyses of males and females. We show how prototypical co-regulation patterns in this map are used to construct a robust quantitative model for the whole-body state. The model consists of two components that together represent the wide diversity in healthy subjects – one component (‘IM1’) explains an average of 25% of the inter-individual variation and the second component (‘IM2’) explains additional 25% (on average) of the variation. Thus, only two components are sufficient to describe the whole-body state, shedding light on a longstanding goal of describing and understanding phenotypic states. Whereas IM1 resembles the conventional definition of MetS, we highlight IM2 as a non-classical type of MetS, driven by another set of clinical parameters. Compared to the conventional characterization of the whole-body state by MetS, we observe that the two-component model presents enhanced associations with disease, thus providing an opportunity to improve personalized diagnosis and treatment. We also show that a small number of measured parameters is sufficient for an accurate assessment of the whole-body state, making this a practical model for future applications.
In this study, we used two cohorts of healthy individuals with different characteristics and distinct origins. One cohort consists of healthy young adults with BMIs in the normal range (75% aged 18–40 with BMI 18–25 kg/m2; n = 473; data from the 500-FG Project Ter Horst et al., 2016). For the other cohort, we sampled older adults who are either overweight or obese (aged 53–78 with BMI >27 kg/m2; n = 126) and are generally healthy (e.g., did not have diabetes and carotid atherosclerosis) (Ter Horst et al., 2020). For simplicity, we refer to these two healthy cohorts as ‘normal BMI’ and ‘obesity’ cohorts, respectively. Subjects in both cohorts are of Western-European (Dutch) background, and both cohorts are part of the Human Functional Genomics Project (http://www.humanfunctionalgenomics.org/site/). Characterization of these cohorts can be found in the Materials and methods and supplementary material (Supplementary file 1A, B and Figure 1—figure supplement 1A).
We performed extensive phenotyping of these individuals with clinical and physiological parameters (in short, ‘parameters’) relating to four main immunometabolic categories: immune, hemodynamic, metabolism parameters, and lipoprotein parameters (a total of 292 and 259 parameters in the obesity and normal-BMI cohorts, respectively; Supplementary file 1C, D). The immunological parameters include the immune composition of blood and the cytokine production capacity of isolated peripheral blood mononuclear cells (PBMCs) in response to ex vivo stimulation with microbial products. The hemodynamic parameters include cardiovascular parameters such as heart rate, as well as hematological parameters such as hematocrit. The metabolic parameters include metabolites in blood and markers of body fat distribution. Finally, the data consists of 168 lipoprotein parameters, which include the composition of lipids within various subfractions of circulating lipoproteins. The validity of these data sets is demonstrated in Figure 1—figure supplements 1 and 2. Throughout this study, we used the relative measured levels of each parameter – that is, for each parameter, its measured levels were centered and scaled across the entire cohort (for simplicity, relative levels are also referred to as ‘measured levels’).
These two cohorts provide an attractive opportunity for a comprehensive evaluation of reproducibility due to their high overlap of clinical parameters (Supplementary file 1C). As more parameters were available for the obesity cohort than for the normal BMI cohort, we first demonstrate our main findings in the obesity cohort and then evaluate the generality of these findings through additional analyses of the normal-BMI cohort.
The map of parameters was constructed in a two-step approach (Materials and methods). In the first step, we applied data transformation in order to amplify signals of similarity among clinical parameters. The transformation is guided by lipoproteins: if two clinical parameters are correlated (after the transformation), it means that they have similar interrelations with lipoproteins. The transformation indeed amplifies the correlations among parameters (Figure 1—figure supplement 3A–C). The second step relies on this amplification to define a robust structure of clinical and physiological parameters. Specifically, the structure was inferred from the transformed data using dimensionality reduction (principal component analysis). The result of this two-step procedure is a low-dimensional space (called a ‘map’) in which each clinical and physiological parameter is embedded in a certain position within the space.
Applying this construction on the obesity cohort yielded a two-dimensional map that could describe a large part of the variability (80%) in clinical and physiological parameters. The two dimensions of the map are referred to as ‘axis IM1’ and ‘axis IM2’ (Figure 1A and Supplementary file 2A). In this map, the various parameters fall in a circular pattern along a nearly continuous spectrum. The circular organization of the map reflects a relatively simple structure of interrelations among parameters: proximal parameters are positively correlated and parameters that are located in opposite sides of the map are negatively correlated (Figure 1B, Figure 1—figure supplement 1B–D, and Supplementary file 2B). For instance, as demonstrated in Figure 1B, the proximal glucose and triglycerides parameters are positively correlated (23% correlation of measured levels), and two parameters located in opposing sides of the map – adiponectin and liver fat – are negatively correlated (43% anti-correlation of measured levels). We confirmed that the positions of parameters in the map of the obesity cohort are similar to the positions of parameters in a map that was constructed using the normal-BMI cohort (Figure 1C, left), despite major differences in age and BMI ranges (see details of the normal-BMI map in Figure 1—figure supplement 4A and Supplementary file 2A). In addition, the map is also consistent in separate analyses of males and females (Figure 1C, right).
Next, to assess the biological organization of the map, we examined the map within each of the four main immunometabolic categories. We noted a different distribution of each category: lipoproteins span the entire boundary of the circle, hemodynamic parameters are primarily localized at the top-right edge of the circle, metabolic parameters are localized in two defined regions (on the left and top-right sides of the circle), and immunological parameters do not have specific localization (Figure 1D and Figure 1—figure supplement 4B). We then focused on the organization within the lipoprotein parameters, which consist of multiple measures in 10 lipoprotein subfractions of the ‘endogenous’ lipoprotein pathway – a central pathway of a gradual change from the subfraction of large very-low-density lipoproteins (XXL-VLDLs) to the subfraction of small low-density lipoproteins (S-LDLs) (Figure 1E, right). We found that for each of the particular measures, there is a clear correspondence between the order of subfractions along the known endogenous pathway and the anticlockwise order of parameters along the map (Figure 1E and Figure 1—figure supplement 4C), indicating that the circle reflects gradual changes in the maturation of lipoproteins. For instance, all 10 parameters of ‘triglyceride content’ within 10 different lipoprotein subfractions are ordered along the circular pattern by their known order in the endogenous pathway. Thus, the map is organized by biological properties of the various clinical and physiological parameters.
Collectively, systematic dissection of the natural diversity in healthy subjects generated a comprehensive and reproducible map that can be used as a reference for subsequent analyses of the whole-body state.
Our next goal was to characterize the overall physiological status of any given individual. To study individual subjects, we started with a personalized visualization of the map of parameters: for each individual, its measurements were presented as color coding of all parameters in the map (indicating the measured levels of all parameters in one individual). Surprisingly, in 91% of the individuals from the two healthy cohorts, we found a global gradient of measurements over the map: in each individual, the measured levels varied from negative (i.e., lower than average) values in one side of the map to positive (i.e., higher than average) values in the opposite side of the map (e.g., obese individuals #1–8 in Figure 2A and normal-BMI individuals #9–18 in Figure 2—figure supplement 1A). For example, individual #1 has positive measured levels at the top-right side and negative measured levels at the bottom-left side of the map. Individuals are distinctively defined by their spatial direction of their gradients (e.g., eight different directions in individuals #1–8) and by their extent of imbalance between the two sides of the gradient (e.g., strong and moderate imbalanced state in individuals #9–16 versus #17–18, respectively). For 9% of the individuals, there were no major differences in measured levels between the two opposing sides of the map – that is, these individuals are in a ‘balanced state’ without any observed gradient (e.g., individuals #19 and #20 in Figure 2—figure supplement 1A).
Given that most individuals are characterized by a certain global gradient over a two-dimensional map, only two quantitative scores are needed to describe the overall phenotypic state of each individual. In accordance, each individual is described by two scores, called ‘IM1 state’ and ‘IM2 state’, which reflect the contributions of the respective axes IM1 and IM2 to the global gradient (Materials and methods). Importantly, the two scores explain substantial fractions of the variation in clinical data, including the variation of each parameter across individuals (the ‘percentage of explained inter-individual variation’) and among parameters within individuals (the ‘percentage of explained inter-parameter variation’) (Figure 2B and Figure 2—figure supplement 1B,C). For instance, in the obesity cohort, for 50% of the individuals, the two scores explain 50% or more of the total inter-parameter variation; similarly, for 50% of the clinical parameters studied, the two scores explain 40% or more of the total inter-individual variation in the obesity cohort. Even when the map from one cohort was used to calculate scores of individuals from the other cohort, the scores still explain substantial percentages of the variation (Figure 2—figure supplement 1D). Furthermore, the calculation of IM1 and IM2 states is robust to different sets of parameters (Figure 2—figure supplement 2). Thus, these data suggest that the combination of only two numbers (i.e., the IM1 and IM2 states) is sufficient to specify the ‘whole-body state’ for the clinical parameters included in our cohort data.
As an overview of the whole-body state across individuals, each individual is described by a point whose coordinates are its IM1-state and IM2-state scores (obesity cohort: Figure 2C, normal-BMI cohort: Figure 2—figure supplement 1E; individuals #1–20 are highlighted). This presentation is referred to as a ‘landscape of states’. According to the definition of the IM1 and IM2 states, the direction and distance of an individual from the center of the landscape reflect the direction and extent of its gradient along the map of parameters (Materials and methods). Importantly, the landscape of states indicates that (i) there is a similar diversity of IM1 and IM2 states in the healthy population, and (ii) the two states are decoupled, with the observation that healthy individuals span the entire spectrum of IM1/2-state combinations. These two observations suggest that the IM1 and IM2 states are of similar importance. Consistent with this notion, the two states indeed have similar contribution to phenotypic diversity: whereas the IM1 state explains 25% (on average) of the variation among obese individuals, the IM2 state explains, on average, additional 25% of this variation (Figure 2B); similarly, the IM1 and IM2 states explain 20% and 28% of the variation among normal-BMI individuals, respectively (Figure 2—figure supplement 1B). Particularly, the best explained parameters for each spectrum of states were found at the two extremities of its relevant axis (Figure 2—figure supplement 3 and Supplementary file 2A).
Collectively, we identified two decoupled states (IM1 and IM2) that together describe the human body with respect to a wide array of clinical and physiological measures. The two states have similar contribution to the overall phenotypic diversity.
Using the landscape, we were able to demonstrate that distinct states are marked by different clinical characteristics (Figure 3A). For example, IM1-positive individuals showed high percentage of monounsaturated fatty acids (MUFAs) and a high waist circumference; IM1-negative individuals showed high high-density lipoprotein (HDL) cholesterol and adiponectin; IM2-positive individuals showed high LDL cholesterol levels and high percentage of basophils; and IM2-negative individuals showed an elevated response of pro-inflammatory cytokines (TNF-α, IL-6, and IL-1β) to bacterial stimulations. Interestingly, IM2 is associated with cytokine responses to all four bacterial stimulations but not to the fungal stimulation (C. albicans; Figure 3B). Fungal pathogens are mainly recognized by C-type lectin receptors (CLRs), unlike the bacterial stimuli used in the study that are typically recognized by the TLR/RLR pathways (Plato et al., 2015), suggesting that pathogen-specific signal transduction may have a role in the creation or maintenance of the IM2-state spectrum. Finally, multiple parameters are markers of a composite IM1/IM2 state – e.g., the ApoB/ApoA ratio is a marker of the IM1-positive and IM2-positive subset of individuals, and triglycerides in L-VLDLs is a marker of individuals carrying the IM1-positive and IM2-negative state (Figure 3C). We note that the positions of parameters in the map correspond to their associations with the state, as exemplified in Figure 3A,C (see Figure 2—figure supplement 3 for other parameters). Thus, the map can be used as a bird’s-eye view for the particular markers of each state.
Next, to better understand the states, we focused on unique biomarkers for the IM1 and IM2 states (namely, markers of one score that are not associated with the other score; Materials and methods). The top 24 markers primarily consist of lipoprotein parameters (Figure 3D,E, Figure 3—figure supplement 1A and Supplementary file 2C), consistent with the computational construction of the map around lipoproteins (see above for definition of the map). We observed that the IM1 and IM2 markers differ strongly between HDL, LDLs, and VLDLs – particularly, IM1-specific markers consist of VLDLs/intermediate density lipoproteins (IDLs), while IM2-specific markers consist of LDL and HDL subfractions. To understand this observation, we expanded the analysis to all lipoprotein subfractions. We found a continuous change in associations along the lipoprotein pathways – for instance, the IM1 state demonstrates a gradual change in associations with a peak in XS/S-VLDLs (Figure 3—figure supplement 1B–D). One exception is the signature of associations with S-HDLs that is different from other HDL subfractions (Figure 3—figure supplement 1E).
The identified markers provide important insights into unique functions of each state. (1) The peak of associations with the IM1 state is in XS/S-VLDL parameters (Figure 3—figure supplement 1B), which is the last step in the removal of triglycerides from VLDLs to peripheral tissues. Triglycerides are removed from VLDLs into cells in the form of fatty acids and glycerol, which are subsequently used for storage and energy production by beta-oxidation, glycolysis, and gluconeogenesis (Feingold and Grunfeld, 2018). Thus, we suspect that the variation in the release of triglycerides from VLDLs, which has an effect on fatty acid metabolism, specifically associated with the IM1 state (Figure 3F). This hypothesis is consistent with the observation that IM1 (but not IM2) is linked to the composition of fatty acids in serum (Figure 3—figure supplement 1F). Furthermore, expression levels of glycolysis/gluconeogenesis genes and beta-oxidation genes are indeed associated with the IM1 (but not IM2) state across individuals (p<10−6, 0.04, respectively, state-function association test, using expression in PBMCs across the normal-BMI cohort; Materials and methods). Collectively, these findings suggest that carbohydrate metabolism in peripheral cells specifically associates with the IM1 spectrum of states. (2) IM2-positive markers include LDL levels and LDL cholesterol, which relate to the influx of cholesterol into peripheral cells and the liver (Feingold and Grunfeld, 2018), and IM2-negative markers include phospholipid and free cholesterol in S-HDLs, which relate to the pick-up of free cholesterol together with phospholipid from peripheral cells into S-HDLs (Feingold and Grunfeld, 2018; Figure 3F). The specificity of this S-HDL association (Figure 3—figure supplement 1E) is in agreement with phospholipid transfer and the esterification of free cholesterol in subsequent HDL subfractions. Thus, these findings suggest that the spectrum of IM2 states relates to variation in cholesterol homeostasis in peripheral tissues.
To identify the additional functions of the IM1 and IM2 states, we leveraged gene expression data. Specifically, we tested whether a statistically significant fraction of genes in pre-defined functional genesets are associated coordinately with the state (a ‘state-function association’ test; Methods). We identified 640 functional annotations that are significantly associated (q-value <0.05) with the IM1 or IM2 states (Figure 3—figure supplement 2A and Supplementary file 2D). When top 100 functional annotations were examined, 14% and 25% were broadly categorized as immunological and metabolic functions, respectively. Many of the functions were exclusively associated with one spectrum of state, such as antigen processing, coagulation, proteasome and electron transport chain (ETC) (IM2 positive associations); transcription and chromatin organization (IM2 negative associations); glycolysis, translation, TCA cycle, and DNA replication (IM1 positive associations); and TNF signaling via NFkB (IM1 negative associations). Other functions were significantly associated with both IM1 and IM2, such as RNA processing, transcription factor activity, and interferon/antiviral signaling. IM1 and IM2 are therefore heavily intertwined – e.g., both IM1 and IM2 relate to energy production (glycolysis – IM1; ETC – IM2), immune functions (e.g., TNF signaling – IM1; antigen presentation – IM2), and regulation of proteins (translation – IM1; proteasome – IM2). To validate the associations, we projected gene transcripts onto the map (forming an ‘extended map’, Figure 3—figure supplement 2B, Materials and methods) and then examined areal patterns of functional annotations over the map. Areal patterns were consistent with our predictions: a coordinated associations of genes with the state reflects a pronounce co-localization of these genes in the corresponding region of the map (e.g., Figure 3—figure supplement 2A,C). We further note that variation between genes within the same pathway is well fitted to the overall state (e.g., Figure 3—figure supplement 2D). Overall, the analysis provides a large repertoire of biological functions that are associated with each spectrum of states.
Taken together, our study suggests that the IM1 and IM2 states are characterized by (1) clinical characteristics – for example, adiponectin and waist circumference (IM1) and pro-inflammatory response (IM2); (2) metabolic processes – for example, cellular metabolism of carbohydrates (IM1) and cellular homeostasis of cholesterol (IM2); and (3) expression of signatures related to molecular functions such as translation, transcription, and RNA processing.
Given that MetS is an established way to characterize the state of the human body within the context of cardiometabolic risk, we compared the IM1 and IM2 states with the standard MetS classification (using the NCEP ATP-III criteria Huang, 2009). We found a strong association between the spectrum of IM1 states and the classification of MetS. First, within the landscape of states, individuals fulfilling the classical criteria for MetS tend toward the positive IM1-state scores (p<10−4, t-test, Figure 4A,B and Figure 4—figure supplement 1A). Second, within the map of parameters, all five MetS criteria – low HDL cholesterol, high triglycerides, high waist circumference, high glucose, and hypertension – are nearby the extremes of the IM1 axis (Figure 4C); circulating lipoproteins and free cholesterol that are not clustered together with MetS characteristics (Huang, 2009) are indeed located far from the IM1 axis (Figure 1A). Third, coloring of the map with the association between MetS and each clinical parameter reveals a global gradient from negative to positive associations along the IM1 axis (Figure 4C), which is in agreement with the identified association between MetS and the IM1 state. Fourth, clinical parameters are similarly associated with MetS and the IM1 state (Figure 4D). In agreement, the top markers of IM1 (but not IM2) are significantly associated with MetS (Figure 4D, Supplementary file 2C), the combination of standard MetS parameters predicted the IM1 state with high accuracy (CV r = 0.87, p<10−27, a linear regression model; Figure 1—figure supplement 1B,C), and carbohydrate metabolism has a known role in MetS (Lonardo et al., 2015). Finally, analysis of independent data indicates that MetS and the IM1 state share similar associations with disease (diabetes and cardiovascular disease [CVD], see below).
These findings suggest that the state of IM1 is a quantitative representation of the conventional MetS classification. Interestingly, we found that the top biomarkers of IM1 correlate well with MetS (Figure 4D), suggesting alternative markers that could be useful in MetS diagnosis. Some IM1-associated functions have a known role in the pathophysiology of MetS (e.g., S/XS-VLDL-triglycerides Kolovou et al., 2005); other IM1-associated functions may offer new insights into the biology of MetS.
Whereas the state of IM1 is linked to MetS, the state of IM2 has substantial contribution to phenotypic diversity but is decoupled from MetS. We therefore reasoned that if the IM2 state is associated with (immunometabolic or cardiometabolic) disease, it is a representation of a novel type of MetS that is uncoupled from the conventional definition of MetS. We therefore next sought to study the whole-body state in the context of disease.
We first asked whether patients with specific disease states are represented well by the IM1/IM2-state model that was initially constructed based on healthy subjects. In analysis of two diseases (diabetes mellitus [within the context of obesity] and systemic lupus erythematosus [SLE]; Materials and methods and Supplementary file 2E), we observed that the whole-body state of almost all disease patients is indeed represented well by the IM1/IM2-state model – that is, (i) disease patients are characterized by a global gradient over the map, (ii) there is a wide spectrum of directions and extents of these gradients, and (iii) the IM1/IM2 states of disease patients explain much of their observed inter-individual variation (Figure 5—figure supplement 1A,B). Thus, the IM1/IM2 states are a general characteristics of the human body in both health and disease. These findings led us to use the same health-based map for the calculation of states in both healthy and disease subjects, providing states that are comparable across different individual groups.
When we compared the states of healthy and disease subjects (Supplementary file 2E), we found that both the IM1 and IM2 states are associated with disease, and each disease associates with a different combination of these two states. In particular, diabetes is associated with a positive IM1 state (p<0.0002) rather than with an IM2 state (p>0.05), in agreement with the fact that MetS is strongly associated with diabetes (Balkau et al., 2007). SLE, in contrast, is associated with a positive IM2 state and a negative IM1 state (p<10−10, 10−8, respectively, Figure 5A,B). Analyses of additional cohorts show broadly similar behaviors, in both males and females (Figure 1—figure supplement 1C,D).
We reasoned that the presence of state-disease associations does not necessarily imply that the entire state of the human body is associated with disease. An alternative possibility is that particular clinical parameters led to the state-disease association because they are associated with disease and are also used for the calculation of the state. Two lines of evidence alleviate this concern. First, similar results were obtained when using a different set of parameters for the calculation of IM1/2-state scores (Figure 2—figure supplement 2), suggesting that the associations are not due to several particular parameters. Second, we used two approaches to calculate the association between each clinical parameter and disease: using a standard association test that accounts for variation in gender, age, and BMI (a ‘conventional association’) and using an extension of the association test that also accounts for variation in IM1/2 states (termed ‘state-independent associations’). This analysis confirmed that most associations with clinical parameters reflect the association with the whole-body state: (i) although the map of parameters was constructed using only healthy individuals, we observed global gradients over the map for the conventional associations between each clinical parameter and disease (Figure 5C, left), which is in agreement with the direction of associations of each disease with the IM1 and IM2 states (Figure 5A,B), and (ii) accounting for the state drastically reduced the number of significant associations (Figure 5D). Similar results were obtained using additional data sets and using an independent map of parameters (Figure 5—figure supplement 2A,B). As most parameter-specific associations are largely explained by the global state-disease association, it is unlikely that the association with the state is due to confounding parameter-specific associations.
Because each of the two states is associated with disease but only IM1 associates with the known MetS classification, our data supports the notion that the spectrum of IM2 states relates to a distinct immunometabolic syndrome, hereafter referred to as ‘non-classical MetS’. For clarity, the conventional MetS – which relates to the spectrum of IM1 states – is referred to as ‘classical MetS’ (Figure 5—figure supplement 3A).
CVD is generally associated with MetS (Balkau et al., 2007) and is therefore expected to be linked with a positive IM1 state. We found a role for both IM1 and IM2 in atherosclerosis, and hence in CVD. First, both IM1 and IM2 relate to clinical parameters with a known causal role in atherosclerosis: out of two known causal atherosclerosis factors (Ference et al., 2017; Nordestgaard, 2016), one is located nearby the most extreme point of the IM1 axis (the percentage of triglyceride in VLDL-remnants [also called ‘IDLs’]) and one is located nearby the most extreme point of the IM2 axis (LDL cholesterol; Figure 6A). More globally, using a literature survey of known CVD risk factors (Supplementary file 2F), we found that for both the IM1 and IM2 axes, their positive arms are enriched with previously documented positive CVD risk factor and their negative arms are enriched with documented negative CVD risk factors (Figure 6A). As in the case of SLE and diabetes (Figure 5C), such a trend reflects a general association between CVD and positive states of both IM1 and IM2.
Second, focusing on atherosclerosis in obese older adult males, in which reported associations between MetS and atherosclerosis are weak (Ballantyne et al., 2008; He et al., 2007; Camhi et al., 2011; International Diabetes Federation, 2006), we indeed found weak relations between positive IM1 states and atherosclerosis (Figure 6B,C). Interestingly, in addition to the expected weak association with IM1, we found a significant link between atherosclerosis and an IM2-positive state (p<0.001). These relations are further supported by the trend of parameter-specific associations over the map (Figure 6D, left), by the fact that parameter-specific associations are eliminated when accounting for the state (Figure 6D, right, and Figure 6E), by using an independent map for the calculation of states (Figure 5—figure supplement 2C), and by recalculation of association values when excluding diabetic patients (Figure 5—figure supplement 2D).
Collectively, these findings suggest that both the positive IM1 state and the positive IM2 state are risk factors for CVD. Thus, the two-component model generalizes the large number of specific risk factors for CVD (Figure 6A), offering a unified view of how specific parameters are linked to CVD. We note that the link of atherosclerosis to IM2 is only relevant for males (Figure 5—figure supplement 1E), in agreement with the understanding that the mechanisms of atherosclerosis in men are different from those in women (Fairweather, 2014; Mathur et al., 2015). Thus, whereas both classical and non-classical MetS are associated with CVD, the relative contribution of each of these syndromes likely depends on the particular context, such as ethnicity, age, gender, and BMI. Overall, our findings coincide with the interpretation of IM1-positive and IM2-positive states as two types of MetS.
What are the tripartite relations between disease, specific causal mechanisms and the whole-body state? For instance, a causal mechanism and the state could have independent effects on disease, or alternatively, the state could mediate between a causal mechanism and disease. As candidate causal mechanisms, we used clinical parameters that are associated with disease independently of the state (based on the state-independent association test). For each disease, we first describe below the candidate causal mechanisms and then consider the results. Diabetes is omitted because it is only linked to one factor (IM1), as it is associated neither with IM2 nor with any clinical parameter independently of the state (except from its defining elements; Figure 5D, right).
To study the relations in SLE, we focused on 88 genes that have significant state-independent associations (q < 10−3; Figure 5D, left) as the candidate causal mechanism. These genes are primarily responsible for interferon/antiviral signaling (p<10−61, hyper-geometric enrichment test), consistent with the known causal role of interferon/antiviral signaling in SLE (Salloum and Niewold, 2011; Supplementary file 2G). We evaluated a combined multivariate linear model consisting of three factors: IM1, IM2, and averaged expression of the interferon/antiviral-signaling genes. To quantify the relative contribution of each factor, we determined how much additional variance was explained when a given factor was added to the model. Because the different factors may have inter-relations and may interact in a nonlinear manner, all possible ordering combinations were examined. The three factors together accounted for nearly 84% of the variance in SLE. Of these 84%, interferon/antiviral signaling explained 14–80%, and IM1/IM2 explained 4–70%. IM1, IM2, and interferon signaling were independent predictors of SLE (p<10−13, 10−13, and 10−103, respectively), and each factor was also a significant predictor of disease in a separate analysis of males and females (p<0.05 in all cases). Thus, these results suggest that interferon signaling and the classical/non-classical MetS state have independent relationships with SLE (Figure 5E and Figure 5—figure supplement 3B).
The example of atherosclerosis in obese males is particularly interesting, as in our cohort none of the clinical parameters is associated with atherosclerosis when accounting for variation in the state (Figure 6E). Even LDL cholesterol, which has a known causal effect on atherosclerosis (both directly and indirectly Ference et al., 2017; Nordestgaard, 2016), is strongly associated with atherosclerosis but is completely decoupled from atherosclerosis when accounting for the IM2 state (Figure 6D,E). Based on systematic analysis of the tripartite relations between LDL cholesterol, IM2, and atherosclerosis (Figure 5—figure supplement 3C), a plausible model is that the indirect effect of LDL cholesterol on atherosclerosis in obese men is mediated through the IM2 state (Figure 6F). It is therefore likely that non-classical MetS has an effect on atherosclerosis, in addition to the known effect of specific causal mechanisms.
Overall, these findings suggest that (1) SLE has independent contributions from specific risk mechanisms and the state, and (2) LDL cholesterol affects atherosclerosis both directly and indirectly through the non-classical MetS state. We conclude that human diseases are directly related to the complex ecosystem within the human body, in addition to the contribution of specific disease mechanisms.
We also asked whether the IM1- and IM2-state scores could be predicted well using a small set of parameters, allowing practical assessment of the whole-body state. For the assessment of a given score, we selected arbitrarily four parameters from the 12 top markers (two from each side of the relevant axis, Figure 3D) and used a linear regression model to predict the score (Materials and methods). Applying the analysis 100 times, we found that the accuracy is high regardless of the specific four-parameters set used (IM1 mean r = 0.96, IM2 mean r = 0.95; cross validation Pearson’s r; Figure 3—figure supplement 3A, see example in Figure 3—figure supplement 3B). Thus, any small set of top parameters can be used for the assessment of the score. These results exemplify the power of reduced sets of parameters for the assessment of the whole-body state in practical applications.
In this study, we make use of healthy cohorts in which a vast amount of immunological, metabolic, and hemodynamic parameters were measured, and report that using a tailored transformation of these data, we could construct an informative two-dimensional map of parameters. Using this map, we found that the normal physiological state of the human body is largely characterized by two components – the IM1 state and IM2 state. Although the number of individuals in our data is relatively small, the reproducibility of the map (and thereby the reproducibility of the IM1 and IM2 states) has been validated in three independent cohorts (300-OB, 500-FG, and the REPROGRAM cohorts) and in different individual groups (men and women, health and disease, different age, and BMI groups). A large percentage of the variation in clinical parameters among individuals (50%, on average) is captured by these two components, emphasizing that the two-component model is a faithful representation of the whole-body state. To the best of our knowledge, this provides the first systematic description for the informal concept of the whole-body state and a framework to faithfully describe the state in individuals, shedding light on a longstanding debate in biomedicine.
Studies of cardiometabolic disease have defined MetS as an aggregate of co-regulated clinical parameters that are associated with a high risk of diabetes and CVD (Balkau et al., 2007). Our de novo reconstruction of the whole-body state (within healthy subjects) supports this classical definition of MetS: we show that the IM1 component of the whole-body state resembles the classical MetS – that is, the spectrum of IM1 states ranges between negative values in non-MetS individuals and positive values in MetS individuals. Importantly, we further show that the other component of the whole-body state (IM2) reflects a novel ‘non-classical MetS’: the IM2 spectrum explains substantial fraction of the variation within healthy subjects, is associated with disease risk, and is decoupled from the classical MetS. Thus, our analysis highlights a MetS-independent mode of the human body that is associated with the onset and progression of human disease.
Among our key findings is the identification of the whole-body state as a general risk factor to various complex diseases. Compared to the classical MetS model, the suggested two-component model (i) improves disease-state associations and (ii) extends the set of diseases that are associated with the whole-body state (Figure 7A). This emphasizes the importance of a combined analysis of the two types of MetS: instead of using the classical MetS as a risk factor for a few diseases, different combinations of the classical/non-classical MetS states could be used as risk factors for a larger number of diseases.
An important application of the framework is the ability to discern disease-specific markers whose associations are independent of the whole-body state (e.g., interferon/antiviral signaling in SLE). Thus, our framework offers a strategy for a superior diagnostics based on a combination of general markers (for the state) and specific markers (for a specific disease), which can be tested in future studies. Another application of the model is to study tripartite relations between specific mechanisms, the whole-body state, and disease. For example, we found that SLE has independent contributions from specific disease mechanisms and the whole-body state (Figure 5E), and LDL cholesterol affects atherosclerosis both directly and indirectly through the whole-body state (Figure 6F). Because it is enough to use a small number of markers for the calculation of the state (Figure 3—figure supplement 3), our strategy can be pursued in clinical practice. As our study cohorts are limited in size and are focused on specific genetic background/ethnicity, future work is needed to test the generality and repeatability of the identified associations and relations in additional populations and contexts.
The analysis specifies a large set of factors that are associated with non-classical MetS. These include a mixture of metabolic factors (e.g., serum cholesterol, LDL cholesterol, LDL particle number, and the composition of phospholipid/free-cholesterol in LDLs and S-HDLs), immune factors (e.g., the percentage of circulating basophils and cytokine production capacity of PBMCs to bacterial stimuli), and transcriptional regulation of intracellular functions (e.g., coagulation, antigen processing, and presentation). By leveraging the fine granularity of lipoproteins, the analysis provides specific hypotheses for the roles of cholesterol metabolism in non-classical MetS. We do not know the identity of the true drivers of non-classical MetS, as the relations between these parameters consist of multiple intertwined feedback loops.
Our unbiased approach offers a comprehensive view of how biological functions are linked to MetS. The IM1 state associates with the standard MetS parameters and with additional parameters (e.g., adiponectin, %MUFA and the composition of XS/S-VLDLs, cellular metabolism of carbohydrates, protein translation, DNA replication, and TNF signaling). These include many of the known MetS-associated functions (e.g., fatty acid metabolism; Lonardo et al., 2015), as well as previously unknown candidates. Thus, the integrated analysis allows to identify new candidates even in the heavily investigated MetS state. Given that the IM1 markers are more strongly associated with IM1 compared to the five conventional MetS parameters (Figure 4D), we hypothesize that the IM1 markers may describe in more detail the (patho)physiological MetS state of individuals. It is therefore possible that the definition of MetS might be improved with IM1-specific markers, potentially leading to a stronger association of classical MetS with cardiometabolic disease. Additional studies are required to test this hypothesis and explore the association of IM1 markers with cardiometabolic disease across large cohorts.
Through inspection of the states, we found that most individuals exist in a certain imbalanced state in which one side of the map is upregulated whereas the opposite side of the map is downregulated. Cellular immunometabolism is indeed thought to be in a ‘dynamic balance’ between antagonistic programs, such as a balance between the pro- and anti-inflammatory signaling and a balance between cell proliferation and apoptosis (Cicchese et al., 2018; Guo and Hay, 1999). Our observations suggest that similar rules of homeostatic balance are prevalent not only at the molecular and cellular level, but are also effective at the clinical and physiological scale.
Finally, our framework offers opportunities for future medical interventions targeting the state. In particular, the ability to quantify the relations between different diseases and the whole-body state provides a unique opportunity to design changes in the whole-body state (Figure 7B). For example, an attractive possibility is to use the landscape of states to design transitions from a high-risk state into a low-risk state. Given that some transitions are likely blocked by energetic boundaries, activation energy (such as drug interventions or long-term changes in lifestyle) would be required to go through these boundaries. We therefore expect that the landscape of states would have an impact on the development of future therapeutic interventions.
The 300-OB cohort, which we generated as a part of the human functional genomics project (HFGP), consists of 167 males and 135 females (Supplementary file 1A). These Dutch volunteers are all overweight or obese (BMI >27) and range in age from 53 to 79 years. The presence of diabetes and carotid atherosclerosis were determined. In particular, the presence of atherosclerotic plaques was determined by ultrasound of the carotid artery. Subjects with a recent cardiovascular event (myocardial infarction, transient ischemic attack, or stroke in the previous 6 months), a history of bariatric surgery or bowel resection, inflammatory bowel disease, renal dysfunction, increased bleeding tendency, use of oral or subcutaneous anti-coagulant therapy, or use of thrombocyte aggregation inhibitors other than acetylsalicylic acid and carbasalate calcium were excluded. The study was approved by the Ethical Committee of the Radboud University Medical Center Nijmegen (NL46846.091.13, 2013/505). Inclusion of volunteers and experiments were conducted according to Declaration of Helsinki, and all subjects signed written informed consent.
We took blood samples in the morning following an overnight fast. Participants who used lipid-lowering therapy temporarily discontinued this medication 4 weeks prior to the measurements. For each of the individuals, a variety of immune, metabolic, and hemodynamic parameters were monitored (Supplementary file 1C,D). We used a high-throughput nuclear magnetic resonance (NMR) metabolomics platform (Nightingale's Biomarker Analysis Platform) for the quantification of lipid and metabolite measures. Circulating inflammatory mediators were measured in human EDTA plasma using Enzyme Linked Immunosorbent Assay (ELISA). All individuals from the 300-OB study underwent a cardiovascular assessment. This included the measurement of carotid intima-media thickness (cIMT), plaque presence, and maximum plaque thickness. After a resting period of at least 30 min, baseline resting diameter and wall thickness of both carotid arteries were assessed by a well-trained sonographer. The cIMT and diameter measurements were performed in the proximal 1 cm straight portion of the carotid artery at three different angles (90°, 120°, and 180°) for six heart beats. The primary outcome variable was defined as the mean cIMT of the three angles. The presence of plaque was defined as focal thickening of the wall of at least 1.5× the mean cIMT or a cIMT >1.5 mm according to the Mannheim intima-media thickness consensus.
The 300-OB cohort includes 126 healthy volunteers (63 males and 63 females without atherosclerosis and diabetes), all are older adults (range from 53 to 78 years) who are either overweight or obese (BMI >27). This healthy cohort of 126 individuals is referred to as the ‘obesity cohort’ (Supplementary file 1A). Volunteers and experimental procedures that we used to generate the cohort are described in detail in ‘The 300-OB cohort’ section.
We further used the 500-FG cohort (Ter Horst et al., 2016), which includes 473 healthy individuals (202 males and 271 females) of Caucasian origin. The individuals range from 18 to 75 years of age with 82% of individuals younger than 30. Individuals had BMIs ranging from 15 to 35 with the majority (388 individuals) in the normal BMI range (18–25). None of these individuals were diagnosed with atherosclerosis or diabetes. We refer to this cohort as the ‘normal BMI cohort’ (Supplementary file 1B).
The parameters of the obesity and normal-BMI cohorts relate to three main immunometabolic categories (Supplementary file 1C,D). The first category consists of immunological parameters including the immune composition of blood (e.g., cell quantities and circulatory levels of cytokines and adipokines; 19 and 8 parameters in the obesity and normal BMI cohorts, respectively). In addition, it includes cytokine production capacity of isolated PBMCs in response to ex vivo stimulation with microbial products. For example, in the obesity cohort, data includes levels of four cytokines (IL-1β, IL-1RA, IL-6, and TNF-a) in response to five stimulations (phytohaemagglutinin [PHA], lipopolysaccharide [LPS], Staphylococcus aureus, Porphyromonas gingivalis, and Candida albicans). The second category of clinical measurements is a collection of hemodynamic parameters, including heart rate, pulse wave velocity, and blood pressure, as well as hematological parameters such as hematocrit. These hemodynamic parameters were only measured in the obesity cohort (15 parameters). The third category consists of metabolism-related parameters: concentrations of fatty acids, sugars, amino acids, lipids, and apolipoproteins in blood; insulin resistance index; and markers of body fat distribution such as visceral and subcutaneous abdominal fat (71 and 59 parameters in the obesity and normal BMI cohorts, respectively). This category also consists of a high-resolution map of lipoproteins, consisting of 10 subfractions over the VLDL to LDL ‘endogenous’ pathway and four parameters over the HDL ‘reverse transport’ pathway, with 12 compiled parameters for each subfraction – including the content of phospholipids, triglycerides, esterified cholesterol, and free cholesterol (a total of 168 lipoprotein parameters for each cohort). Importantly, MetS was assessed only in the obesity cohort but not in the normal-BMI cohort.
For each parameter, centering and scaling were done by fitting all measured levels across a given cohort to the Johnson function. The resulting ‘relative measured levels’ (in short, ‘measured levels’) were used throughout this study. The data set of all relative measured levels of all clinical parameter is referred to as ‘measured data’. Given the limited number of missing values (an average of 5% missing values across parameters), we performed all analyses without any data imputations. In all individual-level analyses, parameters with more than 65% missing values were omitted. For 68 individuals from the normal BMI cohort, RNA-seq data of PBMCs were available from Bakker et al., 2018 Raw reads of RNA-sequencing were initially normalized across the samples, and genes were centered and scaled by fitting to the Johnson function. We identified 11,810 genes that were expressed in at least 63 of the 68 individuals and used them to reconstruct the extended map of parameters.
To generate the map of parameters, we first transformed the data so as to reduce the level of noise while maintaining the original interrelations among clinical parameters. To this end, we tested the correlation between each pair of parameters across individuals. As many parameters lacked meaningful correlations with any other parameter and the lipoproteins showed exceptionally strong correlations (Figure 1—figure supplement 5A), we transformed the measured levels of each clinical and physiological parameter based on its collection of associations with the entire set of lipoprotein parameters. Thus, the transformation represents the measured data of each clinical parameter as a large set of associations with all lipoprotein parameters – substantially different from the original representation in which each clinical parameter is described by a set of measured levels in all individuals. This transformation is based on the notion that (i) associations would allow amplification of correlations (‘similarities’) among clinical parameters, since they provide a reduced level of noise: small fluctuations in measurements of specific individuals are smoothly averaged at the association level; and (ii) the associations maintain the original interrelationships among clinical parameters: if two clinical parameters behave in a similar way, the complete sets of associations with lipoprotein parameters would be potentially similar.
We tested two alternative transformation methods – a linear transformation that does not account for covariates and a covariate-corrected transformation that accounts for variation in age, gender, and BMI. In both cases, the input is a ‘measured profile’ of each clinical parameter (represented as an n-length vector of relative measured levels across the n individuals), and the output is a ‘transformed profile’ for each clinical parameter.
A linear lipoprotein-based transformation. We used the lipoprotein data as a transformation matrix that sends each clinical parameter from a profile of measurements across individuals to a new profile of relations (cosine of angles) between a given parameter and all lipoprotein parameters. In particular, for a given clinical parameter k, we applied the following linear transformation: , where is the measured profile of the -th parameter (using unit vectors for all parameters), is the 168 × n transformation matrix in which entry is the measured profile of the i-th lipoprotein parameter (i=1,…,168) in individual j (j=1,…,n), and finally is a transformed 168-length vector of the -th parameter, referred to as the ‘transformed profile’. Since the measured profiles are unit vectors, the i-th position in the transformed profile of parameter k () reflects the cosine of angle between the n-length vectors of relative measured levels for the k-th clinical parameter and the i-th lipoprotein parameter. Thus, the transformed profile of any given parameter k is a vector of all its relations with the list of lipoprotein parameters.
A covariate-corrected lipoprotein-based transformation. We reasoned that the linear transformation defines relationships of any parameter to the lipoprotein parameters, and that these relationships are possibly biased due to certain covariates such as age, gender, and BMI. To account for these potential confounding effects, we used an alternative transformed profile in which reflects the relations between clinical parameter k and lipoprotein parameter i while accounting for covariates. In particular, for a given clinical parameter k and a lipoprotein parameter i, we used a standard linear regression model , where , , , , and represent n-length vectors of data across the n individuals including the measured profiles (unit vectors) for clinical parameter k () and lipoprotein parameter i (), as well as age (), gender (), and BMI (). Using this formulation, we set . Thus, reflects the relations between the lipoprotein parameter i and the clinical parameter k while accounting for covariates.
Transformations based on associations with other sets of parameters (instead of lipoproteins) were not as effective as a lipoprotein-based transformation (Figure 1—figure supplement 5B). As both the linear transformation and the covariate-corrected transformations provided qualitatively similar results (Figure 1—figure supplement 5C), in this study we focus only on maps that were generated using the covariate-corrected transformed profiles.
Using data from a healthy cohort, the map of parameters was constructed in two steps. First, measured levels of all clinical and physiological parameters were transformed using the covariate-corrected lipoprotein-based transformation. Transformed values were then scaled so that the transformed profile of each clinical parameter had unit length. Second, the map of parameters was constructed by dimension reduction (PCA) of this transformed-and-scaled data. The two first principle components are termed ‘immunometabolic axis 1’ (in short, ‘IM1 axis’) and ‘immunometabolic axis 1’ (in short, ‘IM2 axis’).
To extend the map with gene transcripts, we focused on transcription profiles of PBMCs that were measured in 68 individuals from the normal-BMI cohort, which enabled us to project each gene onto the map of parameters. This extension was applied in two steps: First, we constructed the map using clinical and physiological parameters based on data from the normal-BMI cohort. Second, each gene parameter was projected onto the resulting PCA map (a total of 11,810 genes; see ‘Data preprocessing’ section). The result is an ‘extended map’ consisting of 11,810 new parameters where each new parameter is an expressed gene (Figure 3—figure supplement 2B). The extended map was used in the analysis of biological annotations and in all analyses of SLE.
The calculation of the whole-body state for each individual i takes as input (i) a map of parameters, where each parameter is associated with a certain coordinate on the IM1 and IM2 axes, and (ii) the relative measured levels of each parameter for individual i. The entire calculation of states relies on the relative measured levels (namely, without using the lipoprotein-based transformation).
To infer the state of individual i, we solved the following linear regression: , where is the vector of all relative measured levels of individual i, and are the vectors of coordinates of all parameters on the IM1 and IM2 axes, respectively (namely, the coordinates of all clinical parameters in the map of parameters), and and indicate the unknown 'IM1 state' and 'IM2 state' (respectively) in individual i. The IM1/2-state scores are signed based on the direction of the gradient, with a positive IM1 state (IM2 state) for upregulation in the right (top) side of the map and vice versa. For instance, the scores of an individual with downregulation in the top-left region of the map and upregulation in the bottom-right region of the map are IM1-positive and IM2-negative (e.g., individual no. #3 in Figure 2). The distance between the (,) coordinate and the (0,0) coordinate reflects the 'extent' of regulation along the gradient: the higher the distance, the stronger the upregulation and downregulation along the gradient. The two scores equal zero (or nearly zero) for individuals that lack any global gradient (e.g., individuals #19 and #20 in Figure 2—figure supplement 1A).
Overall, the two scores define the ‘whole-body state’ of each individual. If one or two of the scores differ from zero, there is an observed global gradient of relative measured levels over the map, as if there is an imbalance between two opposing sides of the map. In contrast, if both the IM1-state and IM2-state scores are zero (or nearly zero), there is a lack of global gradient, that is, there is a balance between opposing sides of the map (see indication of imbalanced and balanced states in Figure 2C and Figure 2—figure supplement 1E). Of note, the calculation of states in disease subjects should also consider potential artifacts due to disease-specific alterations, as detailed in the ‘Application of the framework in disease research’ section.
For the analysis of SLE, where clinical parameters were not available, the IM1/IM2-state scores were calculated using transcriptome data, relying on an extended map of gene transcripts. Particularly, we used the extended map that was constructed based on data from the normal-BMI cohort (i.e., the map from Figure 3—figure supplement 2B). Out of 11,810 genes in this extended map, 7443 were also expressed in the SLE data sets (see ‘Disease cohorts’ section). To calculate the states, we used only 596 of the 7443 genes, which were selected based on their relations with lipoproteins in two steps. First, for each gene, we calculated its Pearson correlation with all lipoprotein parameters based on gene expression and lipoprotein data from the normal-BMI cohort (correlations were calculated using transformed profiles). Second, genes with maximal absolute-correlation values that are higher than 0.85 were selected.
We present the calculated IM1/IM2-state scores across a cohort of individuals using a ‘landscape of state’ – a scatter plot in which each individual is described as a point whose coordinates are its IM1-state and IM2-state scores. In this presentation, the gradient of each individual over the map of parameters is specified by its particular position within this landscape: the direction of a coordinate (relative to the center of the landscape) equals the direction of the gradient along the map of parameters, and the distance of a coordinate from the center of the landscape reflects an increasing extent of imbalance between the two sides of the map. Specifically, individuals located at the center of the landscape lack any gradient (i.e., in a balanced state). Figure 2A,C and Figure 2—figure supplement 1A,E provide several examples. For instance, individual #1, whose gradient is directed toward the top-right side of the parameter map, attained IM1-positive and IM2-positive scores; in accordance, it is embedded at the top-right side of the landscape of states. Individuals #17 and #18 that have a moderate imbalance are located closer to the center compared to individuals #1–16 that have stronger imbalance. Individuals #19 and #20, which lack any global gradient, reside at the center of the landscape.
For each individual, the ‘total inter-parameter variation’ is defined as the variation in the relative measured values across all parameters. For each parameter, the ‘total inter-individual variation’ is defined as the variation in the (relative) measured levels across individuals (here, this total variation is always one due to the scaling of measurements, as detailed above). For both the inter-parameter variation and inter-individual variation, we calculated the percentages of variation that are explained the IM1-state and IM2-state scores in two steps: First, for each parameter, we regressed out the two scores from the parameter’s relative measured levels across individuals. Second, the ‘unexplained variation’ is the variation calculated when using the resulting residual values (rather than using the relative measured levels). The ‘percentage of unexplained variation’ is the percentage of unexplained from the total variation, and the 'percentage of explained variation' (either inter-individual or inter-parameter) is the remaining percentage of variation – that is, 100 minus the percentage of unexplained variation. This analysis is reported in Figure 2B, Figure 2—figure supplement 1B–D, and Figure 5—figure supplement 1B.
Biomarkers for the states were identified based on Pearson’s correlation (r) between state scores and measurements. The top biomarkers were selected in three steps. First, correlations of lipoprotein content and percentage (for the same lipid and subfraction) were averaged. Second, we averaged the correlations calculated in the obesity and normal-BMI cohorts. Third, we identified the top markers. For example, for the identification of IM1-specific markers, we filtered out all markers with high (>0.3) averaged correlation to IM2 and then selected the top parameters based on their averaged correlation with IM1 (six best positively correlated and six best negatively correlated with the IM1 state). The same cutoffs were used for the identification of IM2-specific markers. Supplementary file 2C provides the list of the identified markers.
We predicted the state using a small set of parameters. Specifically, we predicted the two scores as two independent tasks, allowing reduced costs when only one score is needed. For the assessment of a given score (IM1 state or IM2 state), we chose arbitrarily two negative parameters and two positive parameters (four parameters in total per score) from the set of 12 top markers (Supplementary file 2C). We fitted a linear regression model that used the score as the dependent variable and the four selected parameters as the explanatory variables. Prediction accuracy (Pearson’s r) was evaluated using leave-one-out cross validation.
To explore the relations of biological functions with the state, we correlated the expression levels of each gene with each state across individuals (Pearson’s r score). Next, for each state (IM1 or IM2 states), gene set enrichment analysis (GSEA) was applied (using the 'fgsea' R package) to test the similarity between the expected and observed distribution of Pearson’s r values (comparing between the gene set to all remaining genes). p-values were assessed using permutations. Significant p-values indicate state-function associations. Candidate gene sets of biological functions were downloaded from the following sources: GO, MSigDB Hallmark gene sets, Reactom, and KEGG. Significant function-state associations (q-value <0.05) are reported in Supplementary file 2D and Figure 3—figure supplement 2A.
Each analysis relies on three input types: a group of disease patients, the group of matching controls from the same cohort data, and a reference map of parameters.
Three comparisons of diseased to healthy groups were performed: (1) 26 healthy and 719 SLE female children from Banchereau et al. (GSE65391), (2) 6 healthy and 87 SLE male children from Banchereau et al. (GSE65391), and (3) 17 healthy and 131 SLE female adults from Chiche et al. (GSE49454). Since SLE cohort no. 1 is the largest, we used it as our primary analysis (Figure 5), and the remaining cohorts are used as additional support (Figure 5—figure supplement 1D and Figure 5—figure supplement 2A). These three data sets consist of gene expression profiles, and in accordance, the analysis was performed in the context of the extended map of genes from Figure 3—figure supplement 2B.
Three comparisons of diseased to healthy groups were performed: (1) 139 non-diabetic and 23 diabetic obese adults with carotid atherosclerosis from the 300-OB cohort, (2) 126 non-diabetic and 14 diabetic obese adults without atherosclerosis from the 300-OB cohort, and (3) 130 non-diabetic and 140 diabetic obese adults with coronary atherosclerosis from the ‘coronary-atherosclerosis cohort’. The coronary-atherosclerosis cohort was collected as part of the HORIZON 2020 European Research Program – ‘REPROGRAM: Targeting epigenetic REPROGRamming of innate immune cells in Atherosclerosis Management and other chronic inflammatory diseases’. The coronary-atherosclerosis cohort consisted of 229 Romanian patients with symptomatic stable coronary artery disease. All patients had angiographically documented coronary atherosclerosis and inducible ischemia at the treadmill or imaging stress testing. The distribution of age, gender, and BMI in this cohort is similar to the 300-OB cohort. We used diabetes cohort no. 1 as our primary analysis (Figure 5), whereas the two other cohorts were used as additional support (Figure 5—figure supplement 1C and Figure 5—figure supplement 2B). For all three comparisons, we used the map of parameters that was originally constructed using healthy obese individuals, as demonstrated in Figure 1A.
Two cohorts were tested. (1) An atherosclerosis male cohort, consisting of 95 men with diagnosed carotid atherosclerosis (with ultrasound) and 72 healthy men as controls. All individuals are obese older adults without cerebrovascular events from the 300-OB cohort. As a reference map of parameters, we used the map of parameters that was originally constructed using obese healthy males from Figure 1C. (2) An atherosclerosis female cohort, consisting of 67 women with diagnosed carotid atherosclerosis (with ultrasound) and 68 healthy women as controls. All individuals are obese older adults without cerebrovascular events from the 300-OB cohort. As a reference map of parameters, we used the map of parameters that was originally constructed using obese healthy females from Figure 1C. The two cohorts refer to asymptomatic atherosclerosis.
Supplementary file 2E provides details about the groups of disease and control subjects and about particular map of parameters that was used in each analysis.
We used the framework to compare healthy and diseased groups that have matching characteristics of age, gender, and BMI. All comparisons relied on a standard score of association between disease and a certain parameter. The ‘conventional association’ score was a –log p-value resulting from a linear mixed model that accounted for relevant covariates of gender, age, and BMI. We used the same formulation to test the association between each of the IM-state scores and disease. In addition, we also calculated associations between a given clinical parameter (or a gene parameter) and disease while controlling for the whole-body state – that is, we used both the IM1-state and IM2-state scores as additional covariates. The –log p-value resulting from this test is referred to as a ‘state-independent association’ score. Reported association p-values were adjusted for false discovery rate that accounts for multiple testing (q-values).
For comparison of subjects with disease to healthy subjects, the map of parameters should be constructed using matching healthy individuals from an independent cohort. For instance, we applied this strategy in the case of SLE, where the map of parameter was constructed using one cohort (the normal-BMI cohort) and the compared healthy/disease groups were derived from another cohort (Supplementary file 2E). Given that a matching healthy cohort is not always available for the construction of a map (e.g., the available independent cohorts do not have a similar list of parameters), the same healthy controls could be used for the map construction and for the healthy/disease comparison. In such case, to avoid overfitting and biases, the IM-state scores of healthy individuals are inferred in a leave-one-out manner – namely, constructing the map multiple times, each time omitting data from one particular individual for which the scores are inferred. We applied this strategy in the case of atherosclerosis and diabetes, where both the reference map and the control groups relied on the same cohort (Supplementary file 2E).
We reasoned that clinical parameters could be altered in disease independently of the state and that these parameters could have an effect on the calculation of the state in disease patients. To decouple between the state and disease-specific alterations, we refined the calculated IM1/IM2-state scores of disease patients in an iterative manner. In each iteration, the procedure removed parameters that were significantly associated independently of the two state scores and then recalculated the states. These two steps were iterated until convergence. To ensure comparability, the final set of parameters that were used to infer the two state scores in disease patients were also used to infer the two scores in the group of healthy individuals. None of the parameters were omitted in analysis of atherosclerosis. The set of significant state-independent SLE associations (Supplementary file 2G) was omitted in SLE. Glucose and Hb1ac were omitted in diabetes.
Both the obesity cohort and the normal BMI cohort were part of the Human Functional Genomics Project (http://www.humanfunctionalgenomics.org/site/) and have been previously published. The coronary-atherosclerosis cohort was collected as part of the HORIZON 2020 European Research Program - "REPROGRAM: Targeting epigenetic REPROGRamming of innate immune cells in Atherosclerosis Management and other chronic inflammatory diseases". SLE sequencing public datasets used in our analysis: GSE65391, GSE49454.
Metabolic syndrome risk for cardiovascular disease and diabetes in the ARIC studyInternational Journal of Obesity 32:S21–S24.https://doi.org/10.1038/ijo.2008.31
Low-density lipoproteins cause atherosclerotic cardiovascular disease: pathophysiological, genetic, and therapeutic insights: a consensus statement from the european atherosclerosis society consensus panelEuropean Heart Journal 41:2313–2330.https://doi.org/10.1093/eurheartj/ehz962
Mapping the protein-protein and genetic interactions of Cancer to guide precision medicineCurrent Opinion in Genetics & Development 54:110–117.https://doi.org/10.1016/j.gde.2019.04.005
Adiponectin, a key adipokine in obesity related liver diseasesWorld Journal of Gastroenterology 17:2801–2811.https://doi.org/10.3748/wjg.v17.i23.2801
Subclinical atherosclerosis and metabolic risk: role of body mass index and waist circumferenceMetabolic Syndrome and Related Disorders 9:119–125.https://doi.org/10.1089/met.2010.0079
Dynamic balance of pro- and anti-inflammatory signals controls disease and limits pathologyImmunological Reviews 285:147–167.https://doi.org/10.1111/imr.12671
Sex differences in inflammation during atherosclerosisClinical Medicine Insights: Cardiology 8:49–59.https://doi.org/10.4137/CMC.S17068
BookIntroduction to lipids and lipoproteinsEndotext.
Low-density lipoproteins cause atherosclerotic cardiovascular disease 1 evidence from genetic, epidemiologic, and clinical studies a consensus statement from the european atherosclerosis society consensus panelEuropean Heart Journal 38:2459–2472.https://doi.org/10.1093/eurheartj/ehx144
HumanNet v2: human gene networks for disease researchNucleic Acids Research 47:D573–D580.https://doi.org/10.1093/nar/gky1126
Pathophysiology of dyslipidaemia in the metabolic syndromePostgraduate Medical Journal 81:358–366.https://doi.org/10.1136/pgmj.2004.025601
Nonalcoholic fatty liver disease: a precursor of the metabolic syndromeDigestive and Liver Disease 47:181–190.https://doi.org/10.1016/j.dld.2014.09.020
Lipoprotein(a) as a cardiovascular risk factor: current statusEuropean Heart Journal 31:2844–2853.https://doi.org/10.1093/eurheartj/ehq386
Triglyceride-Rich lipoproteins and atherosclerotic cardiovascular diseaseCirculation Research 118:547–563.https://doi.org/10.1161/CIRCRESAHA.115.306249
Pattern recognition receptors in antifungal immunitySeminars in Immunopathology 37:97–106.https://doi.org/10.1007/s00281-014-0462-4
A wellness study of 108 individuals using personal, dense, dynamic data cloudsNature Biotechnology 35:747–756.https://doi.org/10.1038/nbt.3870
Interferon regulatory factors in human lupus pathogenesisTranslational Research 157:326–331.https://doi.org/10.1016/j.trsl.2011.01.006
Sex-Specific regulation of inflammation and metabolic syndrome in obesityArteriosclerosis, Thrombosis, and Vascular Biology 40:1787–1800.https://doi.org/10.1161/ATVBAHA.120.314508
Edward D JanusReviewing Editor; University of Melbourne, Australia
Matthias BartonSenior Editor; University of Zurich, Switzerland
Chao WangReviewer; Sunnybrook Research Institute, Canada
Edward D JanusReviewer; University of Melbourne, Australia
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Human diseases arise in a complex ecosystem composed of disease mechanisms and the whole-body state. The authors have provided novel insights having mapped a large collection of metabolic, hemodynamic and immune parameters, used the mapping to dissect phenotypic states and found that the whole-body state is faithfully represented by a quantitative two-dimensional model. Both components are associated with disease, but differ in their particular associations, thus opening new avenues for improved personalized diagnosis and treatment.
Decision letter after peer review:
Thank you for submitting your article "An integrative model of cardiometabolic traits identifies two types of metabolic syndrome" for consideration by eLife. Your article has been reviewed by three peer reviewers, including Edward D Janus as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Chao Wang (Reviewer #1).
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
As the editors have judged that your manuscript is of interest, but as described below revisions are required before it is published, we would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). First, because many researchers have temporarily lost access to the labs, we will give authors as much time as they need to submit revised manuscripts. We are also offering, if you choose, to post the manuscript to bioRxiv (if it is not already there) along with this decision letter and a formal designation that the manuscript is "in revision at eLife". Please let us know if you would like to pursue this option. (If your work is more suitable for medRxiv, you will need to post the preprint yourself, as the mechanisms for us to do so are still in development.)
Irit Gat-Viks and coauthors attempted to address an intriguing and ambitious question: what constitutes an individual's whole body state. It is an exciting study but lacks some details. It expands the established concept of metabolic syndrome predisposing to diabetes and cardiovascular disease. From very detailed analysis of many metabolic, immunological and haemodynamic parameters it defines two types of metabolic syndrome IM 1 largely representing the already established metabolic syndrome and a new IM 2 non classical metabolic syndrome. Disease states are associated with either or both.
Overall, the study is very exciting and provides a thought-provoking challenge and potentially an important new paradigm. Providing further specific details as requested below would benefit general readers. The manuscript is however very long and difficult to follow in parts, so it needs to be simplified. Try to streamline the presentation and abbreviate the findings and main message to have impact.
(1) Style and overall issues: The manuscript especially the Results and Discussion should be shortened considerably and the key points need to be stated very clearly and explicitly as in this draft the messages get lost in the detail. The objective of each section needs to be set out clearly at the outset – as it already is for some – and the conclusion from each section needs to be stated clearly and concisely at the end of each section.
The final 8 parameters of interest are in supplementary figure 8 D but get lost there.
The reader at the outset expects that some of the immunological parameters may feature in the second type IM – 2 but in the end both types are primarily based on lipid parameters. This should be explained in the discussion.
The Abstract does not give information on the statistical methods employed.
The Discussion should start with the main findings. The Discussion should end with a clear conclusion and clarify how the findings can be utilized in future research and or clinically.
2) What drives the IM1 and IM2 state in their PCA plot for each of the cohort the authors investigated? The authors refer to 8 measured parameters being "sufficient" in characterizing the whole body state, and they are all lipids. How were they chosen? What would be informative is what other types of parameters drives the IM1/2 state in normal BMI and obesity cohort especially in the map of parameters where lipid transformation is used.
2) The authors subsequently generated an extended map with RNAseq data from normal BMI cohort. How gene expression data correlate with their clinical/non-clinical parameters which they initially used to define the whole body state? In particular, what are the top genes/pathways associated with IM1/2 state in healthy BMI cohort? Detailed heatmap or other comprehensive visualization should be used rather than hand picking a few pathway such as the IFN response pathway. This should be done before characterizing disease state. Overall, the study gave us an exciting proposition of a whole body state but does not tell us what constitutes the whole body state (namely what drives IM1/2).
3) Unbundling of the MetS has been attempted using cluster analysis and a principal components approach. So, what does this new approach add? How reproducible are the findings? What are the consequences of this new knowledge? What are the practical implications of the investigation? What is the value of knowing the 2 subtypes identified? More discussion required.
4) In Figure 1A, it is important to indicate on the map (Figure 1A) where each type of clinical/non-clinical parameters are located (color code perhaps such as in Figure 1—figure supplement 5). While we appreciate it is not possible to highlight every single parameter, it would be of interest to readers in general to visualize covariance of parameters within the same category. For example, it is intriguing to see IL-6 on the opposite spectrum of IL-1b/TNFa, which is not necessarily expected. The authors measured IL-6 in a wide variety of conditions. Which IL-6 did this come from? The authors spent a lot of effort demonstrating the power of their data transformation looking at single parameter correlates, but it would be of more benefit for the readers to appreciate their results in a comprehensive manner: can they visualize the correlation coefficient or IM1/2 coordinates in the form of a heatmap for example rather than a supplementary table?
5) In Figure 1D, what does the author mean by "The two maps (normal BMI and obesity) are very similar in their general architecture"? Do the authors refer to the general position of the same clinical/non-clinical parameter on the map? What happens to parameters that are not measured in both cohort? The authors state the normal BMI and obesity cohort has different parameter measured. For example, based on Supplementary file 1D, completely different immune/cytokine parameters are measured for each cohort. Are immune/cytokine parameter not at all in the map in Figure 1D? If they are, where do they land? Again, a heatmap of all parameters would be beneficial for also the normal BMI cohort.
6) It is unclear how the authors picked the genes (only 596 of 7443 available) from SLE cohort for example to generate the map as they didn't have clinical parameters measured. Did the authors picked genes correlated with lipid enzymes?
[Editors' note: further revisions were suggested prior to acceptance, as described below.]
Thank you for choosing to send your work entitled "An integrative model of cardiometabolic traits identifies two types of metabolic syndrome" for consideration at eLife. Your resubmission and responses have been assessed by a Senior Editor in consultation with a member of the Board of Reviewing Editors.
The authors need to (Q1) highlight and discuss the 8 key parameters and (Q2) relate these to what is currently referred to the "metabolic syndrome". (Q3) The authors will thereby provide a perspective on their concepts for the reader.https://doi.org/10.7554/eLife.61710.sa1
(1a) Style and overall issues: The manuscript especially the Results and Discussion should be shortened considerably and the key points need to be stated very clearly and explicitly as in this draft the messages get lost in the detail. The objective of each section needs to be set out clearly at the outset – as it already is for some – and the conclusion from each section needs to be stated clearly and concisely at the end of each section.
As suggested, we substantially revised the Results by moving details to Materials and methods, adding conclusions at the end of each section and refining the text. We also shortened the Discussion section with a focus on the main points. Due to this reorganization, some of the Figures that originally appeared in Supplementary figure 3, Supplementary figure 4 are now included in Figure 1—figure supplement 6.
(1b) The final 8 parameters of interest are in supplementary figure 8 D but get lost there.
We thank the reviewer for this comment. We now focus on the eight parameters in the revised Figure 2D. We also add additional demonstration for the eight parameters in the new Figure 2D-II.
(1c) The reader at the outset expects that some of the immunological parameters may feature in the second type IM – 2 but in the end both types are primarily based on lipid parameters. This should be explained in the discussion.
Two observations explain the particular selection of eight parameters. First, the best markers for the states reside at the extreme sides of each axis, implying that the markers should be selected from these regions. Second, each category of parameters (immune, hemodynamic, metabolism, lipoproteins) have a different distribution over the map, and the relevant regions at the extremities of the axes primarily contain metabolism/lipoproteins. Thus, any eight selected parameters from these regions would be based on metabolic/lipoprotein parameters.
We now added several new visualizations to emphasize these points: (1) The organization of parameter-to-state correlations over the map (new Figure 2—figure supplement 3A), and the distribution over the map of the variance that is explained by IM1 and IM2 (new Figure 2—figure supplement 3B); together, these visualizations indicate that the markers should be chosen from the extreme sides of each axis. (2) The distribution of different categories of parameters over the map (new Figure 1D, new Figure 1—figure supplement 4B), indicating the lack of immune/hemodynamic parameters in the relevant regions of marker selection. We now highlight these points in the Results section.
(1d) The Abstract does not give information on the statistical methods employed.
We now highlight the pipeline of the methodology (building a map and then deriving a state based on the map) in the Abstract.
(1e) The Discussion should start with the main findings. The Discussion should end with a clear conclusion and clarify how the findings can be utilized in future research and or clinically.
We agree with the reviewer and apologize for the lack of clarity in the Discussion. As suggested, we have shortened the Discussion and now focus only on the main findings (first and third paragraphs), and three main applications (fourth to fifth paragraphs).
2) What drives the IM1 and IM2 state in their PCA plot for each of the cohort the authors investigated? The authors refer to 8 measured parameters being "sufficient" in characterizing the whole body state, and they are all lipids. How were they chosen? What would be informative is what other types of parameters drives the IM1/2 state in normal BMI and obesity cohort especially in the map of parameters where lipid transformation is used.
We agree with the reviewers that our original manuscript did not present sufficient information on the relations between parameters and the whole-body state, nor did we provide sufficient explanation for the selection of the eight markers. In the revised manuscript we therefore made the following changes:
1) We added two systematic analyses that together indicate the relation of each parameter with the wholebody state: (i) For each parameter, analysis of parameter-to-state correlation, depicted on top of the map (new Figure 2—figure supplement 3A); and (ii) For each parameter, the fraction of its total variation that is explained by the state (repeating these analyses for the IM1 state, IM2 state, or both) (new Figure 2—figure supplement 3B). These analyses indicate that all parameters residing nearby the extreme sides of each axis are tightly related to the corresponding state (subsection “The map enables identification of two main components of the whole-body state – the IM1 and IM2 states”).
2) We clearly define how the markers were selected (subsection “The map enables identification of two main components of the whole-body state – the IM1 and IM2 states”). In particular, as the two analyses above indicate that the markers should be selected from the extremities of each axis, the set of eight markers consists of the two most extreme parameters from each side of each axis. We further directly demonstrate the relations between these selected markers and the state (new Figure 2D-II).
3) We show that different selections of 8 markers provide similar accuracy, as long as the 8 markers are chosen from the two extreme sides of each axis (new Figure 2—figure supplement 3D).
4) We provide the distribution of each type of parameters (new Figure 1D). The fact that the 8 selected parameters are mainly lipids is in agreement with these distributions.
5) We show that the eight selected markers, which were originally selected based on the obesity cohort, perform well in the normal-BMI cohort (new Figure 2—figure supplement 3C).
6) We highlight these five points in the text (Results). We believe that these changes cover the various concerns that were raised by the reviewers.
3) The authors subsequently generated an extended map with RNAseq data from normal BMI cohort. How gene expression data correlate with their clinical/non-clinical parameters which they initially used to define the whole body state? In particular, what are the top genes/pathways associated with IM1/2 state in healthy BMI cohort? Detailed heatmap or other comprehensive visualization should be used rather than hand picking a few pathway such as the IFN response pathway. This should be done before characterizing disease state. Overall, the study gave us an exciting proposition of a whole body state but does not tell us what constitutes the whole body state (namely what drives IM1/2).
The reviewers raised two concerns, one related to the relations of genes with the IM1/2 states, and the other one related to the relations of biological functions with the IM1/IM2 states.
1) Relations of IM1/2 states with genes. We now added an analysis of the relations between each gene and the state (either IM1, IM2, or both) (new Figure 2—figure supplement 3A).
2) Relations of IM1/2 states with biological functions. The original manuscript included systematic analysis of functional annotations and their relations with the IM1/IM2 states. Following the reviewer’s comment, we realized that the source of this confusion was the fact that these results were presented only in supplementary table (i.e., not in the Results section, nor in any visualization). We therefore made the following changes. First, we added an overview presentation for the association of each functional annotation with the IM1 and IM2 axes (new Figure 1—figure supplement 5B). Several of the functions are highlighted in this new visualization (revised Figure 1—figure supplement 5C). Second, we provide more details about the organization of biological functions over the map (Results) and the implied relations between functions and the IM1/2 states (subsection “The map enables identification of two main components of the whole-body state – the IM1 and IM2 states”). The functional characterization of the map (and the states) is reported before the analysis of disease, as suggested by the reviewers. For clarity, we have changed the name of the relevant statistical test (now “areal pattern test”, formerly the “spatial colocalization test”). We improved the description of the method accordingly (subsection “The map enables identification of two main components of the whole-body state – the IM1 and IM2 states”).
4) Unbundling of the MetS has been attempted using cluster analysis and a principal components approach. So, what does this new approach add? How reproducible are the findings? What are the consequences of this new knowledge? What are the practical implications of the investigation? What is the value of knowing the 2 subtypes identified? More discussion required.
We thank the reviewer for this suggestion. We now revised substantially the Discussion section to highlight these points. Specifically, the IM1 state coincides with the known MetS categorization, and the IM2 state is suggested as a novel decoupled type of metabolic syndrome (second paragraph); reproducibility is emphasized (first paragraph), and the implications for disease research is demonstrated and discussed.
5) In Figure 1A, it is important to indicate on the map (Figure 1A) where each type of clinical/non-clinical parameters are located (color code perhaps such as in Figure 1—figure supplement 5). While we appreciate it is not possible to highlight every single parameter, it would be of interest to readers in general to visualize covariance of parameters within the same category. For example, it is intriguing to see IL-6 on the opposite spectrum of IL-1b/TNFa, which is not necessarily expected. The authors measured IL-6 in a wide variety of conditions. Which IL-6 did this come from? The authors spent a lot of effort demonstrating the power of their data transformation looking at single parameter correlates, but it would be of more benefit for the readers to appreciate their results in a comprehensive manner: can they visualize the correlation coefficient or IM1/2 coordinates in the form of a heatmap for example rather than a supplementary table?
We apologize for the lack of clarity, and have revised the text and figures to address these questions. (1) We now provide a comprehensive visualization (a heatmap) of correlation coefficients between all pairs of parameters (new Figure 1—figure supplement 3D, Figure 1—figure supplement 4A). We also provide the same information in a textual format (new Supplementary file 2B) and revised the text accordingly (Results). As expected, this visualization highlights the global structure of correlations between nearby parameters and anti-correlations between parameters in opposite sides of the map. (2) In the original manuscript we referred to the total concentration of IL6 in blood as “IL6”, whereas the ex vivo capacity of isolated PBMCs of IL6 secretion in response to a particular pathogen is referred to as “IL6/pathogen-name”. We agree that this is confusing. To address this, we now use the term “circulating IL6” instead of “IL6” (e.g., Supplementary file 2A, revised Figure 1A, new Figure 1—figure supplement 3C) and the text is now explicit on this point (Materials and methods [legends of new Supplementary figure 3C]). (3) As suggested, we added examples for the unexpected anticorrelation between the concentration of IL6 in circulation and the secretion capacity of different cytokines (IL-6/IL-1β/TNF-α) in response to PHA stimulation (new Figure 1—figure supplement 3C).
6) In Figure 1D, what does the author mean by "The two maps (normal BMI and obesity) are very similar in their general architecture"? Do the authors refer to the general position of the same clinical/non-clinical parameter on the map? What happens to parameters that are not measured in both cohort? The authors state the normal BMI and obesity cohort has different parameter measured. For example, based on supplementary Table 1D, completely different immune/cytokine parameters are measured for each cohort. Are immune/cytokine parameter not at all in the map in Figure 1D? If they are, where do they land? Again, a heatmap of all parameters would be beneficial for also the normal BMI cohort.
We agree with the reviewer, and apologize for the missing details in our description and supplementary information. We made the following changes: (1) As noted by the reviewers, we indeed referred to similar positions of the same parameters in the two maps. The text is now explicit on this point (Results). (2) We added two revised tables. In Supplementary file 1C, the table lists the entire collection of parameters, indicating for each parameter whether it was measured in the obesity cohort, the normal-BMI cohort, or both. In Supplementary file 2A, for each cohort, the table provides the map positions of all measured parameters (including parameters that appear in only one cohort). (3) We added two new visualizations that indicate the similar organization of the normal-BMI and obesity maps: first, the distribution of biological categories over the map (new Figure 1D versus new Figure 1—figure supplement 4B), and second, the correlation coefficients between pairs of parameters (new Figure 1—figure supplement 3D and Figure 1—figure supplement 4A). In both demonstrations, the entire set of normal-BMI parameters is presented (rather than only parameters that were measured in both cohorts).
7) It is unclear how the authors picked the genes (only 596 of 7443 available) from SLE cohort for example to generate the map as they didn't have clinical parameters measured. Did the authors picked genes correlated with lipid enzymes?
The source of confusion was our lack of clarity in the methodological description of the selection rules in the original manuscript. We did not select the genes based on data in the SLE cohort. Rather, we selected the genes based on data in the normal-BMI cohort, in which both gene expression and lipoproteins were measured. The extended map that we use for the analysis of SLE data is also the map that was originally constructed based on the normal BMI data. We have revised the text to clarify these points (Materials and methods).
[Editors' note: further revisions were suggested prior to acceptance, as described below.]
The authors need to (Q1) highlight and discuss the 8 key parameters and (Q2) relate these to what is currently referred to the "metabolic syndrome". (Q3) The authors will thereby provide a perspective on their concepts for the reader.
The reviewers pointed out the lack of a comprehensive analysis of markers, which we apologize for and now address in full. We completely agree that markers could be used to characterize the states. As the reviewers point out, this may provide an important perspective on the concept and biological meaning of states and on the particular relations with MetS. Following the constructive comments raised, we revised our manuscript and included additional visualizations.
(Q1) highlight and discuss the 8 key parameters.
We added a thorough and unbiased analysis of the top IM1-specific and IM2-specific markers. Specifically, as we originally introduced only 8 markers that were selected arbitrarily from a larger set of 24 most-accurate markers, the list of top markers is now extended to include the entire set of best 24 markers. We include several alternative visualizations of these top markers (new Figure 3DE, new Figure 3—figure supplement 1A, new Supplementary file 2C), demonstrate the top markers in the larger context of the lipoprotein pathways (new Figure 3F, new Figure 3—figure supplement 1B-E), discuss potential biological mechanisms that could underlie the observed signatures (Results, new Figure 3—figure supplement 1F), highlight the top markers (Discussion section and Abstract), and provide the precise method for the selection of these markers (Materials and methods). In addition, we also include demonstration of significant markers that are not the top-significant markers (new Figure 3A-C, Results). Thus, the overall characterization of the state relies on insights from top significant markers, additional significant markers, and functional annotations (Results, Discussion). We thank the reviewer for this suggestion, which we believe strengthened our work.
We now realized that the difference between single markers for the state and state-prediction using a combination of markers is not clear. We originally used the combination of 8 markers to approximate the prediction of the state. These 8 markers were selected arbitrarily from a larger set of 24 best candidate markers. While we did not intend to emphasize any particular combination, it was obviously highly confusing. In the revised manuscript, we clarify the difference between these two analyses. First, we moved the two analyses to two separate sections (single markers: Results; marker combinations: Discussion). In accordance, prediction accuracy was moved to Figure 3—figure supplement 3 (former Figure 2D). Second, we added a separate methodological explanation for each of the analyses (single markers: ; marker-combinations: Materials and methods). Third, we refined the description of state-prediction in the Results section to emphasize the differences from a simple single-marker analysis. We now highlight the fact that all combinations of parameters (selected from the top 24 parameters) have high accuracy (revised Results and revised Figure 3—figure supplement 3A), implying that there is no particular interest in one specific combination. Fourth, the original combination of 8 parameters (for which we show prediction accuracy in Figure 3—figure supplement 3B) is demonstrated in detail in new Figure 3E and highlighted together with the remaining top markers in the Results, thus directly addressing the reviewer’s request.
For simplicity, we now present the analysis of functional annotations as a natural extension of the analysis of markers (revised Results, Materials and methods). We note that the reported geneset P-values in Figure 3—figure supplement 2A and Supplementary file 2D (former Figure 1—figure supplement 5B and Supplementary file 2C) are slightly different from the original submission due to two main changes: First, we revised the way we calculate the P-values in order to present the analysis as an extension of single-marker analysis (now referred to as “function-state association”, formerly referred to as “areal pattern”). Second, we now use a larger number of permutations for the calculation of P-values. As the former and new formulation are equivalent, and indeed provide very similar P-values, there is no change in the identified functions.
(Q2) relate these to what is currently referred to the "metabolic syndrome".
The relations of the identified state-specific markers with MetS are now demonstrated in several ways. First, we directly demonstrate that the associations of parameters with the IM1 state are similar to the associations of parameters with MetS (new Figure 4D). Second, we show that the top 12 markers of IM1 are significantly associated with MetS (highlighted in new Figure 4D), whereas the top 12 markers of IM2 are not associated with MetS (new Supplementary file 2C). Third, we show how each single MetS parameter relates to the IM1 state (new Figure 4—figure supplement 1B) and how the combination of the five MetS parameters predicts the IM1 state with high accuracy (new Figure 4—figure supplement 1C). These findings provide important insights: (i) a list of alternative markers that could be useful in MetS diagnosis, potentially leading to a stronger association of classical MetS with cardiometabolic disease; and (ii) novel IM1-associated parameters, the role of which in MetS is still unknown. We now clarify these points in the Results section and Discussion section.
(Q3) The authors will thereby provide a perspective on their concepts for the reader.
The revised manuscript describes more clearly the relevance of our model, which (1) provides a novel framework that can reliably describe different phenotypic states in humans and that could better predict the presence of disease, (2) provides individual markers that could potentially improve individual risk/disease prediction, (3) adds to the biological understanding of which biological processes drive disease states (Discussion). The biological understanding is obtained based on various new analyses, including analysis of individual markers (new Figure 3A-C, Results), top-significant markers (new Figure 3D-F, Results), and highlighting additional examples of functional annotations (Results).https://doi.org/10.7554/eLife.61710.sa2
- Amit Frishberg
- Irit Gat-Viks
- Amit Frishberg
- Irit Gat-Viks
- Niels P Riksen
- Mihai G Netea
- Niels P Riksen
- Mihai G Netea
- Leo AB Joosten
- Irit Gat-Viks
- Amit Frishberg
- Irit Gat-Viks
- Niels P Riksen
- Mihai G Netea
- Mihai G Netea
- Niels P Riksen
- Leo AB Joosten
- Mihai G Netea
- Adrian C Iancu
- Ioana M Dregoesc
- Bogdan A Tigu
- Niels P Riksen
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
We thank all of the volunteers in the 300-OB cohort for their participation. This work was supported by the Israel Science Foundation (ISF) grant 288/16, by a European Research Council grant (ERC 637885), and by IN-CONTROL CVON grants (CVON2012-03 and CVON2018-27) of the Netherlands Heart Foundation. LABJ is supported by a Competitiveness Operational Programme grant of the Romanian Ministry of European Funds (HINT, ID P_37_762; MySMIS 103587). IG-V is a Faculty Fellow of the Edmond J Safra Center for Bioinformatics at Tel Aviv University, and is supported and by the European Union Horizon 2020 under grant agreement No. 847422. AF was supported by ISF288/16, by the Edmond J Safra Center for Bioinformatics at Tel Aviv University, and by ERC 637885. MGN is supported by a Netherlands Organization for Scientific Research Spinoza Grant (NWO SPI 94-212) and a European Research Council Advanced grant (ERC 833247 to MGN). NPR, LABJ, MGN, ACI, IMD and BAT received funding from the European Union Horizon 2020 research and innovation program REPROGRAM under grant agreement No 667837. NPR received a JTC2018 grant (‘MEMORY’) from the European Research Area Network on Cardiovascular Disease (ERA-CVD).
- Matthias Barton, University of Zurich, Switzerland
- Edward D Janus, University of Melbourne, Australia
- Chao Wang, Sunnybrook Research Institute, Canada
- Edward D Janus, University of Melbourne, Australia
© 2021, Frishberg et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Haemoglobinopathies are the commonest monogenic diseases worldwide and are caused by variants in the globin gene clusters. With over 2400 variants detected to date, their interpretation using the ACMG/AMP guidelines is challenging and computational evidence can provide valuable input about their functional annotation. While many in silico predictors have already been developed, their performance varies for different genes and diseases. In this study, we evaluate 31 in silico predictors using a dataset of 1627 variants in HBA1, HBA2, and HBB. By varying the decision threshold for each tool, we analyse their performance (a) as binary classifiers of pathogenicity, and (b) by using different non-overlapping pathogenic and benign thresholds for their optimal use in the ACMG/AMP framework. Our results show that CADD, Eigen-PC, and REVEL are the overall top performers, with the former reaching moderate strength level for pathogenic prediction. Eigen-PC and REVEL achieve the highest accuracies for missense variants, while CADD is also a reliable predictor of non-missense variants. Moreover, SpliceAI is the top performing splicing predictor, reaching strong level of evidence, while GERP++ and phyloP are the most accurate conservation tools. This study provides evidence about the optimal use of computational tools in globin gene clusters under the ACMG/AMP framework.
Cells convert electrical signals into chemical outputs to facilitate the active transport of information across larger distances. This electrical-to-chemical conversion requires a tightly regulated expression of ion channels. Alterations of ion channel expression provide landmarks of numerous pathological diseases, such as cardiac arrhythmia, epilepsy, or cancer. Although the activity of ion channels can be locally regulated by external light or chemical stimulus, it remains challenging to coordinate the expression of ion channels on extended spatial–temporal scales. Here, we engineered yeast Saccharomyces cerevisiae to read and convert chemical concentrations into a dynamic potassium channel expression. A synthetic dual-feedback circuit controls the expression of engineered potassium channels through phytohormones auxin and salicylate to produce a macroscopically coordinated pulses of the plasma membrane potential. Our study provides a compact experimental model to control electrical activity through gene expression in eukaryotic cell populations setting grounds for various cellular engineering, synthetic biology, and potential therapeutic applications.