Lung cancer (LC) prognosis is closely linked to the stage of disease at diagnosis. We investigated the biomarker potential of serum RNAs for the early detection of LC in smokers across different prediagnostic time intervals and histological subtypes. In total, 1061 samples from 925 individuals were analyzed. RNA sequencing was performed with an average of 18 million reads per sample. We generated machine learning models using normalized serum RNA levels and found that smokers diagnosed with LC within 10 years can be robustly separated from healthy controls regardless of histology, with an average area under the ROC curve (AUC) of 0.76 (95% CI, 0.68-0.83). Furthermore, the strongest models, which took both time to diagnosis and histology into account, successfully predicted non-small cell LC (NSCLC) 6 to 8 years before diagnosis, with an AUC of 0.82 (95% CI, 0.76-0.88), and small cell LC (SCLC) 2 to 5 years before diagnosis, with an AUC of 0.89 (95% CI, 0.77-1.0). The most important separators were microRNAs, miscellaneous RNAs, isomiRs, and tRNA-derived fragments. We have shown that LC can be detected years before diagnosis and the manifestation of disease symptoms, independently of histological subtype. However, the highest AUCs were achieved for specific subtypes and time intervals before diagnosis. The collection of models may therefore also predict the severity of cancer development and its histology. Our study demonstrates that serum RNAs can be promising prediagnostic biomarkers in an LC screening setting, from early detection to risk assessment.
The datasets generated for this manuscript are not readily available because of the principles and conditions set out in articles 6 (1) (e) and 9 (2) (j) of the General Data Protection Regulation (GDPR). A national legal basis as per the Regulations on population-based health surveys and ethical approval from the Norwegian Regional Committee for Medical and Health Research Ethics (REC) are also required. Requests to access the datasets should be directed to the corresponding authors with a project proposal. Please refer to our project website for the latest information on data sharing (kreftregisteret.no/en/janusrna). Our scripts, plot data, and bioinformatics workflow files can be accessed from our GitHub repository (https://github.com/sinanugur/LCscripts).
Lung Cancer analyses scripts. GitHub.
- Hilde Langseth
- Trine B Rounge
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Human subjects: This study was approved by the Norwegian Regional Committee for Medical and Health Research Ethics (REC no: 19892, previously 2016/1290) and was based on broad consent from participants in the Janus cohort. The work has been carried out in compliance with the standards set by the Declaration of Helsinki.
- YM Dennis Lo, The Chinese University of Hong Kong, Hong Kong
© 2022, Umu et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.