TY  - JOUR
TI  - Unsupervised machine learning reveals risk stratifying glioblastoma tumor cells
AU  - Leelatian, Nalin
AU  - Sinnaeve, Justine
AU  - Mistry, Akshitkumar M
AU  - Barone, Sierra M
AU  - Brockman, Asa A
AU  - Diggins, Kirsten E
AU  - Greenplate, Allison R
AU  - Weaver, Kyle D
AU  - Thompson, Reid C
AU  - Chambless, Lola B
AU  - Mobley, Bret C
AU  - Ihrie, Rebecca A
AU  - Irish, Jonathan M
A2  - Robles-Espinoza, C Daniela
A2  - Cole, Philip A
A2  - Laffy, Julie
VL  - 9
PY  - 2020
DA  - 2020/06/23
SP  - e56879
C1  - eLife 2020;9:e56879
DO  - 10.7554/eLife.56879
UR  - https://doi.org/10.7554/eLife.56879
AB  - A goal of cancer research is to reveal cell subsets linked to continuous clinical outcomes to generate new therapeutic and biomarker hypotheses. We introduce a machine learning algorithm, Risk Assessment Population IDentification (RAPID), that is unsupervised and automated, identifies phenotypically distinct cell populations, and determines whether these populations stratify patient survival. With a pilot mass cytometry dataset of 2 million cells from 28 glioblastomas, RAPID identified tumor cells whose abundance independently and continuously stratified patient survival. Statistical validation within the workflow included repeated runs of stochastic steps and cell subsampling. Biological validation used an orthogonal platform, immunohistochemistry, and a larger cohort of 73 glioblastoma patients to confirm the findings from the pilot cohort. RAPID was also validated to find known risk stratifying cells and features using published data from blood cancer. Thus, RAPID provides an automated, unsupervised approach for finding statistically and biologically significant cells using cytometry data from patient samples.
KW  - machine learning
KW  - brain tumors
KW  - phospho-proteins
KW  - single cell
KW  - glioblastoma
KW  - mass cytometry
JF  - eLife
SN  - 2050-084X
PB  - eLife Sciences Publications, Ltd
ER  - 