Pan-cancer association of DNA repair deficiencies with whole-genome mutational patterns
Abstract
DNA repair deficiencies in cancers may result in characteristic mutational patterns, as exemplified by BRCA1/2 deficiency and its use in predicting the efficacy of PARP inhibitors. We trained and evaluated predictive models for loss-of-function (LOF) of 145 individual DDR genes based on genome-wide mutational patterns, including structural variants, indels, and base-substitution signatures. We identified 24 genes whose deficiency could be predicted with good accuracy, including expected mutational patterns for BRCA1/2, MSH3/6, TP53, and CDK12 LOF variants. CDK12 is associated with tandem duplications, and we demonstrate here that this association can accurately predict gene deficiency in prostate cancers (area under the ROC curve=0.97). Our novel associations include mono- or biallelic LOF variants of ATRX, IDH1, HERC2, CDKN2A, PTEN, and SMARCA4. Our systematic approach yielded a catalogue of predictive models, which may provide targets for further research and development of treatment, and potentially help guide therapy.
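The area under the ROC curve quoted above can be read as the probability that a randomly chosen deficient tumour receives a higher model score than a randomly chosen proficient one. The following is a minimal sketch of that calculation in Python, not the authors' implementation; the function name `auroc` and the example scores and labels are hypothetical and are not data from the study.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney identity).

    scores: model outputs, higher meaning "more likely deficient" (assumption).
    labels: 1 for tumours with the gene LOF, 0 for tumours without it.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    # Count positive/negative pairs that are ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six tumours, three with and three without the LOF.
example_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
print(round(auroc(example_scores, example_labels), 3))  # 0.889 (8 of 9 pairs ordered correctly)
```

The pairwise form is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formulation gives the same value more efficiently on large cohorts.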
Data availability
This study is based on analyses of human germline and cancer somatic variant data. The data sets were generated and made available by the Pan-Cancer Analysis of Whole Genomes (PCAWG) consortium and the Hartwig Medical Foundation (HMF). Most of the data cannot be made publicly available because they include protected personal data, including germline variants. However, access to the underlying data sets can be obtained through applications to ICGC/TCGA and HMF, as described below.
The public parts of the PCAWG data set are available at https://dcc.icgc.org/releases/PCAWG, whereas controlled files may be accessed through applications to dbGaP and DACO, which should include a project proposal, as instructed at https://docs.icgc.org/pcawg/data/. The ICGC study ID of the project is EGAS00001001692.
The HMF data used in this project may be found under accession code DR-044 and can be obtained by submitting an application with a project proposal to the Hartwig Medical Foundation (https://www.hartwigmedicalfoundation.nl/en).
Non-personal summary data have been supplied in supplementary tables S1 to S10:
- Supplementary Table 1: All included tumours and their primary tumour locations
- Supplementary Table 2: 736 DDR genes, hg19 coordinates, and the number of pathogenic events across 6,065 cancer genomes
- Supplementary Table 3: All SBS signature contributions, indel counts, and SV counts, per sample; zip-compressed; tab-separated values (.tsv), may be opened in Microsoft Excel
- Supplementary Table 4: All SBS signature contributions, indel counts, and SV counts, per sample, log-transformed and scaled to z-scores; zip-compressed; tab-separated values (.tsv), may be opened in Microsoft Excel
- Supplementary Table 5: Proposed etiologies of base substitution signatures
- Supplementary Table 6: All models (n=535)
- Supplementary Table 7: Pathogenic events in each of the 535 LOF-sets
- Supplementary Table 8: Shortlisted models (n=48)
- Supplementary Table 9: Correlation between features in shortlisted models
- Supplementary Table 10: Survival analysis for the shortlisted models
The third-party software used for data analysis includes:
- Pathogenicity annotation using CADD annotation software, which may be accessed at https://cadd.gs.washington.edu
- Signature analysis using Signature Tools Lib, which was installed from GitHub: https://github.com/Nik-Zainal-Group/signature.tools.lib
Code that we developed locally for the analysis can be accessed at: https://github.com/SimonGrund/DDR_Predict
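Supplementary Table 4 is described as holding the per-sample features after log-transformation and scaling to z-scores. The exact transformation is not specified here, so the sketch below makes two labelled assumptions: a pseudocount of 1 before taking the natural logarithm (so that zero counts are defined) and scaling by the population standard deviation. The function name `log_zscore` and the example counts are hypothetical.

```python
import math
import statistics


def log_zscore(counts):
    """Log-transform one feature across samples, then scale it to z-scores.

    Assumptions (not confirmed by the source): log(1 + x) is used so that
    zero counts are defined, and the population standard deviation is used.
    """
    logged = [math.log1p(c) for c in counts]
    mean = statistics.fmean(logged)
    sd = statistics.pstdev(logged)
    if sd == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0] * len(logged)
    return [(x - mean) / sd for x in logged]


# Hypothetical structural-variant counts for four tumours.
sv_counts = [0, 3, 10, 250]
for count, z in zip(sv_counts, log_zscore(sv_counts)):
    print(count, round(z, 2))
```

Applying the transformation per feature puts signature contributions, indel counts, and SV counts on a comparable scale, which keeps features with very large raw counts from dominating a model.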
Article and author information
Funding
Novo Nordisk Fonden (NNF15OC0016662)
- Eva R Hoffmann
Cancer Research UK (C23210/A7574)
- Eva R Hoffmann
Danmarks Frie Forskningsfond (8021-00419B)
- Jakob Skou Pedersen
Kræftens Bekæmpelse (R307-A17932)
- Jakob Skou Pedersen
Aarhus Universitets Forskningsfond (AUFF-E-2020-6-14)
- Jakob Skou Pedersen
Sundhedsvidenskabelige Fakultet, Aarhus Universitet (PhD stipend)
- Simon Grund Sørensen
Sundhed, Region Midtjylland (A2972)
- Gustav Alexander Poulsgaard
Danmarks Grundforskningsfond (DNRF115)
- Eva R Hoffmann
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Ethics
Human subjects: We analysed data generated and made available by the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) as well as the Hartwig Medical Foundation (HMF). The research conforms to the principles of the Helsinki Declaration.
Reviewing Editor
- W Kimryn Rathmell, Vanderbilt University Medical Center, United States
Publication history
- Received: June 20, 2022
- Accepted: February 26, 2023
- Accepted Manuscript published: March 8, 2023 (version 1)
Copyright
© 2023, Sørensen et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.