Biomarkers: Getting closer to the clinic

Associations between plasma protein levels and DNA methylation patterns can be used to predict the onset of age-related chronic disease.
  1. Toshiko Tanaka
  2. Luigi Ferrucci (corresponding author)
  1. Intramural Research Program, National Institute on Aging, National Institutes of Health, United States

Lengthening life expectancies and decreased mortality rates have led to an unprecedented expansion of the older population. At the same time, chronic diseases – which often affect older individuals – have become more prevalent. Healthcare systems around the world are falling short of this challenge, in part because they remain focused on preventing and curing one disease at a time, even though 80% of clinical patients over 60 have multiple diseases at once.

Even when one specific disease causes most of a person’s symptoms, older patients often have co-existing conditions that affect the course, treatment and prognosis of the main disease. Pressed for time, physicians often ignore underlying illnesses until they begin to seriously affect the patient’s health or start causing frailty. There is no easy solution to this rising crisis, but the emerging field of biomarkers may soon come to the aid of clinicians.

Biomarkers are molecules, genes or characteristics that can be used to detect or predict the onset of a disease. Traditionally, biomarkers have included circulating levels of plasma proteins, lipids and other metabolites. More recently, epigenetic markers – chemical modifications of DNA that affect whether genes are turned on or off, such as addition of methyl groups at specific DNA sites – have shown promise as biomarkers for age-related conditions. Using biomarkers could allow physicians to obtain a molecular map of a patient’s health from a single drop of blood. This would allow clinicians to detect illnesses before they become symptomatic, which is particularly important in the case of serious conditions that could become chronic (Tanaka et al., 2020).

Developing algorithms that extract the relevant information from biomarkers in the blood is perhaps the most promising and potentially powerful line of research in chronic diseases. Until recently, most biomarker studies examined one layer of information (DNA modifications, protein levels or specific metabolites) at a time. However, combining information on DNA methylation with the level of a small number of circulating proteins has been shown to predict the risk of specific chronic diseases as well as global, adverse health outcomes such as having several illnesses at once, and mortality (Lu et al., 2019; Levine et al., 2018; Belsky et al., 2020). Now, in eLife, Riccardo Marioni from the University of Edinburgh and colleagues – including Danni Gadd, Robert Hillary, Daniel McCartney and Shaza Zaghlool as joint first authors – report on how to leverage the associations between DNA methylation and protein levels to predict the onset of disease earlier and more accurately (Gadd et al., 2022).

The team (who are based in the United Kingdom, the United States, Germany, Australia and Qatar) first measured the abundance of 953 proteins in the blood plasma of people in the German KORA cohort (an epidemiological study that ran from 1984 to 2001 in Augsburg and evaluated participants every five years, with an emphasis on major chronic diseases) and the Scottish Lothian Birth Cohort 1936 (the surviving participants of the Scottish Mental Survey 1947 who now live in the Lothian area of Scotland). Gadd et al. then used machine learning to identify clusters of specific DNA methylation sites that could predict the levels of each protein in the plasma. This data was used to assign an epigenetic score or ‘EpiScore’ to each protein. Using this approach, Gadd et al. found that their new algorithm could predict between 1% and 58% of the variation between different people in the plasma levels of 109 proteins.
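
The idea of an EpiScore can be illustrated with a minimal Python sketch. It is not the algorithm of Gadd et al., which is not detailed here. It assumes that a score is a weighted sum of methylation levels at a few sites, and that the "variation predicted" is the squared correlation between the score and the measured protein level. The site labels, weights and protein values are hypothetical and chosen only for illustration.

```python
from statistics import mean

def epi_score(methylation, weights, intercept=0.0):
    """Weighted sum of methylation levels (0 to 1) at the selected sites."""
    return intercept + sum(w * methylation[site] for site, w in weights.items())

def variance_explained(predicted, measured):
    """Squared Pearson correlation between predicted and measured values."""
    mp, mm = mean(predicted), mean(measured)
    cov = sum((p - mp) * (m - mm) for p, m in zip(predicted, measured))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_m = sum((m - mm) ** 2 for m in measured)
    return cov * cov / (var_p * var_m)

# Hypothetical weights; in a real analysis these would be learned from data.
WEIGHTS = {"site_a": 2.0, "site_b": -1.0}

# Hypothetical methylation levels for four people at the two sites.
people = [
    {"site_a": 0.10, "site_b": 0.80},
    {"site_a": 0.40, "site_b": 0.60},
    {"site_a": 0.70, "site_b": 0.30},
    {"site_a": 0.90, "site_b": 0.20},
]
# Hypothetical measured plasma levels of one protein (arbitrary units).
measured_protein = [1.0, 2.5, 2.0, 4.0]

scores = [epi_score(person, WEIGHTS) for person in people]
r2 = variance_explained(scores, measured_protein)
print(f"EpiScores: {[round(s, 2) for s in scores]}")
print(f"Share of variation explained: {r2:.0%}")
```

In this picture, a protein whose EpiScore explains 1% of the variation is tracked only loosely by methylation, whereas one at 58% is tracked closely.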

Next, the team applied the EpiScores of the 109 proteins to data from an independent epidemiological study called Generation Scotland to test whether it was possible to predict the onset of 11 major chronic diseases, as well as death, over a follow-up period of 14 years (Figure 1). This resulted in the identification of 137 connections between EpiScores and 11 diseases or death. Some EpiScores predicted the onset of selected conditions, but others were associated with multiple conditions. Perhaps unsurprisingly, the results also suggested a strong correlation between inflammation and age-related chronic disease.
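
What a "connection" between a score and disease onset means can be sketched in Python with a deliberately crude measure: the ratio of incidence rates (new cases per person-year of follow-up) between people with above-median and below-median scores. This is a simplified stand-in, not the statistical model used by Gadd et al., and it ignores factors such as age and sex. The cohort records below are hypothetical.

```python
from statistics import median

def incidence_rate(records):
    """New cases per person-year of follow-up."""
    events = sum(1 for r in records if r["event"])
    person_years = sum(r["years"] for r in records)
    return events / person_years

def rate_ratio(records):
    """Incidence rate in the above-median score group divided by the rate
    in the rest. Undefined if the lower group has no events."""
    cutoff = median(r["score"] for r in records)
    high = [r for r in records if r["score"] > cutoff]
    low = [r for r in records if r["score"] <= cutoff]
    return incidence_rate(high) / incidence_rate(low)

# Hypothetical records: EpiScore, years until diagnosis or end of follow-up,
# and whether the disease was diagnosed during follow-up.
cohort = [
    {"score": -0.6, "years": 14.0, "event": False},
    {"score": -0.1, "years": 14.0, "event": False},
    {"score": 0.2, "years": 9.0, "event": True},
    {"score": 0.5, "years": 14.0, "event": False},
    {"score": 1.1, "years": 6.0, "event": True},
    {"score": 1.6, "years": 4.0, "event": True},
]

print(f"Incidence rate ratio (high vs low score): {rate_ratio(cohort):.2f}")
```

A ratio well above 1 would suggest that a higher score goes with earlier or more frequent onset. A real analysis would also need to quantify the uncertainty and correct for testing many scores against many diseases.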

Figure 1. Epigenetic scores of plasma proteins predict onset of major chronic diseases over 14 years.

A machine learning approach was used to find associations, called EpiScores, between DNA methylation (top left) and the abundance of 953 plasma proteins (top right). The results identified 109 proteins with EpiScores that explained between 1% and 58% of the variance in their levels. These scores were then applied to an epidemiological study that contains the medical records of 1,537 individuals over the course of 14 years. Gadd et al. found 137 connections between these EpiScores and 11 age-related conditions (represented by icons), and also between the EpiScores and mortality (represented by the survival graph).

One notable observation, which has also been reported in previous studies, is that the EpiScore analyses confirmed known associations between certain proteins and diseases, even when the correlation between the EpiScore and the protein was only moderate. This suggests that EpiScores are not a mere proxy for plasma protein levels, but may contain different information about disease risk. In the future, it is likely that biomarkers for disease will encompass multiple molecular layers, such as protein levels together with epigenetic markers or metabolite composition.

The findings of Gadd et al. offer a glimpse into a possible future of medicine. One could imagine a busy physician evaluating a 75-year-old patient complaining of sudden back pain. The physician collects a small blood sample and analyzes it using a fast robotized laboratory connected to a powerful computer that can measure molecular biomarkers and assign a ‘health score’. The computer would then provide information about the patient’s risk for potential diseases that the physician can address before they become symptomatic. The systematic use of this technology could increase awareness and understanding of co-existing, but not yet visible, medical problems.

Of course, before this can happen, more research is needed. The predictivity of some EpiScores is modest and only adequate for risk prediction. Even in this context, it would be important to understand whether performing early interventions on patients with high scores is cost-effective. As always in prevention, there is a trade-off between the stigma of tagging an individual as ‘high risk’ and how this information can be used to improve health. A study in which information about proteins and DNA methylation is first compared ‘head to head’ in the same large cohort, and then combined, could reveal whether these two biomarkers provide complementary information and increase specificity. Over time, the data collected systematically using this approach and surveillance studies of electronic medical records could help identify common co-morbidities, allowing clinicians to develop more effective strategies for treating patients with complex combinations of diseases.


Article and author information

Author details

  1. Toshiko Tanaka

    Toshiko Tanaka is in the Intramural Research Program, National Institute on Aging, National Institutes of Health, Baltimore, United States

    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4161-3829
  2. Luigi Ferrucci

    Luigi Ferrucci is the Scientific Director of the Intramural Research Program, National Institute on Aging, National Institutes of Health, Baltimore, United States

    Competing interests
    No competing interests declared
    ORCID: 0000-0002-6273-1613

Publication history

  1. Version of Record published: February 25, 2022 (version 1)


This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.




Cite this article

  1. Toshiko Tanaka
  2. Luigi Ferrucci
Biomarkers: Getting closer to the clinic
eLife 11:e77180.
