Risk Modeling: Predicting cancer risk based on family history

A new software package provides more accurate cancer risk prediction profiles and has the ability to integrate more genes and cancer types in the future.
Michelle F Jacobs (corresponding author)
Internal Medicine, University of Michigan, United States

Countless hours have been dedicated to researching cancer – how to prevent it, how to diagnose it early, and how to treat it. Yet, cancer remains a leading cause of death worldwide, accounting for almost 10 million fatalities in 2020.

Most cancers are caused by changes to genes that happen over a person’s lifetime. In rarer cases (about 5–10%), they start due to inherited genetic mutations that produce a predisposition to cancer. In these instances, also known as familial or hereditary cancer syndromes, the mutation is passed down from generation to generation. In these families, more members tend to develop cancers than expected – often of the same or related type – which can also start at a particularly early age.

It is important to identify people with such genetic mutations so that they – and any family members at higher risk – can undergo enhanced cancer screening. Family history can be a useful predictor of hereditary cancer risk (Blackford and Parmigiani, 2010). As such, risk prediction models that incorporate family history to estimate a person’s chance of having a mutation in a cancer predisposition gene or of developing cancer have been employed for many years (Chen et al., 2004).

Historically, such models have been particularly valuable for deciding who should be offered genetic testing when only a few, often costly, genetic tests were available (Fasching et al., 2007). In some cases, insurance companies require the estimated risk of carrying a cancer-related genetic mutation to exceed a certain threshold (typically 5 or 10%) before they reimburse the cost of a genetic test (Chen et al., 2006). As research advances, the number of genes available for cancer-related genetic testing has now reached over 100 and is likely to continue increasing. Nevertheless, older risk modeling programs generally include only a small number of genes in their predictions. Now, in eLife, Danielle Braun and colleagues – including Gavin Lee and Jane Liang as joint first authors – report on a new software package that has the capacity to evolve alongside advances in cancer research (Lee et al., 2021).

The researchers, who are based at ETH Zürich, EPFL, Harvard, the Dana-Farber Cancer Institute, and the Broad Institute, developed PanelPRO, a tool that uses evidence gathered from extensive literature reviews to model the complex interplay between genes and cancer risk. PanelPRO’s workflow consists of four main parts: input, preprocessing, algorithm, and output (Figure 1).

Figure 1. Workflow for PanelPRO.

First, information on family history, including cancer diagnoses, ages of relatives, and cancer risk factors, is entered into the risk modeling software PanelPRO (input, blue box on the left). Then, PanelPRO validates data formatting (preprocessing, grey oval) and analyses information about the frequency of, and cancer risks for, family cancer syndromes (algorithm, grey box) to estimate the likelihood of a person in a family having a mutation in a gene linked to an increased risk of cancer (output, green box on the right). Mutation probability and cumulative cancer risk are each given as a probability between 0.0 (no risk) and 1.0 (100% risk).

The user first adds information about a history of cancers in a family – such as ages and cancer diagnoses – and other factors that might affect cancer risk. These include any risk-reducing surgeries in relatives, or tumors with biomarkers that might indicate a potential hereditary cause of their cancer. The software then adds information on the frequency of different hereditary cancer syndromes and assesses their associated cancer risks. PanelPRO can currently accommodate 18 types of cancer and generate predictions of probable mutations for 24 genes, but its code allows for the addition of new cancers or cancer-related genes that may be identified in the future.

During the preprocessing stage, the software checks the input for missing information and for family relationships it does not support, such as ‘double cousins’, which occur when two siblings have children with two siblings from another family. If any issues are detected, the user receives messages, warnings, or errors.
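The kind of check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`id`, `mother`, `father`, `age`) and the helper `check_pedigree` are hypothetical and are not PanelPRO's input format or code, and detecting unsupported relationships such as double cousins is left out.

```python
def check_pedigree(members):
    """Return a list of warning/error strings for a toy pedigree.

    `members` is a list of dicts with hypothetical keys:
    'id', 'mother', 'father' (IDs or None) and 'age' (int or None).
    """
    issues = []
    known_ids = {m["id"] for m in members}
    for m in members:
        # Missing information produces a warning rather than a failure.
        if m.get("age") is None:
            issues.append(f"warning: member {m['id']} has no age recorded")
        # A parent ID that matches no record makes the pedigree inconsistent.
        for role in ("mother", "father"):
            parent = m.get(role)
            if parent is not None and parent not in known_ids:
                issues.append(
                    f"error: member {m['id']} lists unknown {role} {parent}"
                )
    return issues


# Made-up family: member 2 lacks an age, member 3 points to a missing father.
family = [
    {"id": 1, "mother": None, "father": None, "age": 70},
    {"id": 2, "mother": None, "father": None, "age": None},
    {"id": 3, "mother": 1, "father": 9, "age": 45},
]
issues = check_pedigree(family)
for line in issues:
    print(line)
```

Returning a list of messages, rather than stopping at the first problem, lets the user see and correct every issue in one pass.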

After the information has been checked and modified as needed, the model proceeds to the algorithm stage. To calculate the output, the algorithm uses probabilities based on the family history, the frequency of hereditary cancer syndromes in the population, and the cancer history that would be expected if a cancer syndrome were present. The program then estimates the likelihood that a person in a family has a mutation in a gene linked to an increased risk of cancer. These calculations can also be easily run for other family members using the existing information. The program also provides a personalized estimate of future cancer risks, and users can choose which cancer types and genes to display.
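The way these three ingredients combine can be illustrated with Bayes' rule. The sketch below assumes a single gene and a single person considered in isolation, with made-up numbers. The helper `carrier_posterior` is hypothetical and is not part of PanelPRO, which additionally has to account for how mutations are inherited across the whole family.

```python
def carrier_posterior(prior, lik_carrier, lik_noncarrier):
    """Posterior probability of carrying a mutation, via Bayes' rule.

    prior          -- population frequency of carriers (assumed value)
    lik_carrier    -- P(observed cancer history | carrier)
    lik_noncarrier -- P(observed cancer history | non-carrier)
    """
    numerator = prior * lik_carrier
    evidence = numerator + (1.0 - prior) * lik_noncarrier
    return numerator / evidence


# Illustrative numbers only: 1 in 400 people carry the mutation, and the
# observed history is 20 times more likely in carriers than in non-carriers.
posterior = carrier_posterior(prior=0.0025, lik_carrier=0.20, lik_noncarrier=0.01)
print(f"{posterior:.3f}")  # prints 0.048
```

Even a history that is 20 times more likely in carriers yields a posterior below 5% here, because the mutation is rare in the population to begin with.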

However, some outstanding issues remain. Misreported family history information, such as an inaccurate cancer diagnosis or unknown age of diagnosis, can significantly affect estimates, highlighting that the accuracy of patient-reported information is key to producing correct estimates (Katki, 2006). While patients have been shown to generally provide accurate information on cancer history for first-degree relatives, the accuracy of these reports decreases for more distant relations (Augustinsson et al., 2018; Murff et al., 2004).

Moreover, analyses with similar risk modeling software have revealed that strict adherence to a 10% risk threshold to qualify for a test for a probable mutation in the BRCA gene (which is linked to an increased risk of developing breast, ovarian, and other cancers) would miss around 25% of individuals carrying a mutation when compared to genetic testing outcomes (Varesco et al., 2013). This is likely because cancer risks associated with hereditary cancer syndromes are more variable than initially appreciated, and not all family histories may exhibit a predictable pattern of cancer, even when a mutation is present (Okur and Chung, 2017). This complicates risk assessments and argues against making decisions about genetic testing solely based on risk prediction models. Today, broader insurance coverage guidelines and lower costs for genetic tests have increased clinicians’ ability to order these tests, even if certain risk thresholds are not met based on family history.
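To make concrete what "missing" carriers under a fixed threshold means, here is a small Python sketch. The data are invented for illustration and are not taken from Varesco et al. or any other study, and `missed_carrier_fraction` is a hypothetical helper.

```python
def missed_carrier_fraction(predictions, threshold=0.10):
    """Fraction of true carriers whose predicted probability is below threshold.

    `predictions` is a list of (predicted_probability, is_carrier) pairs.
    """
    carrier_probs = [p for p, is_carrier in predictions if is_carrier]
    if not carrier_probs:
        raise ValueError("no carriers in the data")
    missed = sum(1 for p in carrier_probs if p < threshold)
    return missed / len(carrier_probs)


# Made-up (predicted probability, tested positive) pairs.
toy = [
    (0.32, True), (0.07, True), (0.15, True), (0.51, True),
    (0.04, False), (0.12, False), (0.02, False),
]
missed = missed_carrier_fraction(toy)
print(missed)  # 1 of the 4 carriers falls below 0.10, so this prints 0.25
```

Lowering the threshold reduces the missed fraction but sends more non-carriers for testing, which is the trade-off behind the 5% and 10% cut-offs.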

Nevertheless, the higher number of genes and cancer types supported by PanelPRO compared to other risk models is impressive, and its ability to incorporate new genes and cancer types as testing advances is key in this fast-paced, constantly advancing field.


References

  1. Blackford A, Parmigiani G (2010) Familial cancer risk assessment using BayesMendel. In: Michael F. O, John T. C, editors. Biomedical Informatics for Cancer Research. Springer. pp. 301–314.

Article and author information

Author details

  1. Michelle F Jacobs

    Michelle F Jacobs is in the Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, United States

    Competing interests: No competing interests declared
    ORCID: 0000-0002-0458-1952

Publication history

  1. Version of Record published: September 29, 2021 (version 1)


© 2021, Jacobs

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.



Cite this article

  1. Michelle F Jacobs (2021) Risk Modeling: Predicting cancer risk based on family history. eLife 10:e73380.
