The cell-free DNA (cfDNA) pool contains a mixture of fragments from different sources such as tumor cells and background (mainly cells of hematopoietic origin). After performing paired-end …
(a) sWGS fragment length histograms for 86 prostate cancer patients; colors reflect ctDNA fractions estimated from driver variant allele fractions obtained from targeted sequencing performed on the …
Circulating tumor DNA (ctDNA) fractions were determined either based on copy-number variants (‘ichorcna’) or using variant allele fractions (‘vaf’) of putative driver variants identified using deep, …
NMF was run with different numbers of components (x-axis); for each fitted model, the maximum correlation between the summed weights of any subset of components and the VAF-based ctDNA% was obtained.
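The subset search described in this legend can be illustrated with a minimal sketch. It assumes Pearson correlation and takes the signed maximum; the legend does not state which correlation measure was used or whether absolute values were taken. The function names and the toy numbers are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant input: correlation undefined, reported as 0 here
    return cov / sqrt(vx * vy)


def best_subset_correlation(weights, ctdna):
    """Try every non-empty subset of components, sum their weights per
    sample, and return (highest correlation with ctdna, component indices).

    weights: one list of component weights per sample.
    ctdna:   one ctDNA% value per sample, in the same sample order.
    """
    k = len(weights[0])
    best_r, best_subset = float("-inf"), ()
    for size in range(1, k + 1):
        for subset in combinations(range(k), size):
            summed = [sum(row[j] for j in subset) for row in weights]
            r = pearson(summed, ctdna)
            if r > best_r:
                best_r, best_subset = r, subset
    return best_r, best_subset


# Hypothetical toy data: 4 samples, 3 NMF components.
weights = [
    [0.1, 0.5, 0.2],
    [0.3, 0.1, 0.4],
    [0.5, 0.4, 0.1],
    [0.6, 0.2, 0.5],
]
ctdna = [10.0, 30.0, 50.0, 60.0]
r, subset = best_subset_correlation(weights, ctdna)
```

The number of subsets grows as 2^k - 1, so exhaustive enumeration of this kind is only practical for a small number of components.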
Non-negative matrix factorization (NMF) signatures were estimated for random subsets of samples of different sizes on the shallow whole-genome sequencing (sWGS) data from the metastatic …
The shallow whole-genome sequencing (sWGS) dataset from the mCRPC cohort was randomly split into halves multiple times. For each partitioning, NMF trained on one half of the data (‘Train’) was used …
The figure shows which datasets and analyses were used to produce each of the 6 panels in Figure 2.
(a) sWGS fragment length histograms for the 533 DELFI samples; colors indicate case-control status of the sample. (b) Fragment length signatures inferred using NMF with two components on the sWGS …
(a) The distribution of Signature #1 weights for different cancer types. (b) ROC curve for cancer vs control classification using Signature #1.
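As an illustration of how a single signature weight can serve as a classification score, the area under the ROC curve can be computed directly from pairwise comparisons of cases and controls. The sketch below assumes that higher Signature #1 weights indicate cancer, which the legend does not state; the function name and the toy values are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen case scores higher
    than a randomly chosen control (ties count as one half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical signature weights for 3 cases (label 1) and 3 controls (label 0).
sig1 = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]
is_cancer = [1, 1, 1, 0, 0, 0]
auc = roc_auc(sig1, is_cancer)  # 8 of 9 case-control pairs are ordered correctly
```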
Two-component NMF models were trained separately for each cancer type by combining samples for that cancer type with healthy controls. (a) AUCs for discriminating cases vs controls across cancer …
Each subplot shows an individual with at least three samples. The days on the x-axis are relative to the operation date. Top facet shows the variant allele frequency of the EGFR or ERBB2 mutation …
Fragment length distributions in mCRPC cohort.
Sheet 1 contains raw fragment length distributions from WGS data along with ctDNA% estimates. Sheet 2 contains raw fragment length distributions from targeted data.
Source code and data to produce Figure 2.
Source code and data to produce Figure 3.
Scripts used for the analysis of the DELFI data.
This includes a script to train NMF, a script to estimate the weight of NMF components and a script to train and evaluate a linear SVM model.
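For orientation, the weight-estimation step can be sketched as a non-negative least-squares fit of one fragment length histogram against fixed signatures. The sketch uses standard multiplicative updates for the Frobenius objective; this is an assumption chosen for illustration and not necessarily the method implemented in the deposited scripts. The function name, parameters and toy signatures are hypothetical.

```python
def estimate_weights(signatures, histogram, n_iter=500, eps=1e-12):
    """Estimate non-negative weights w such that sum_j w[j] * signatures[j]
    approximates histogram, keeping the signatures fixed.

    signatures: list of k signatures, each a list of m non-negative values.
    histogram:  list of m non-negative values (one sample).
    """
    k = len(signatures)
    m = len(histogram)
    w = [1.0 / k] * k  # strictly positive starting point
    for _ in range(n_iter):
        # Current reconstruction of the histogram from the weights.
        recon = [sum(w[j] * signatures[j][i] for j in range(k)) for i in range(m)]
        for j in range(k):
            num = sum(s * v for s, v in zip(signatures[j], histogram))
            den = sum(s * r for s, r in zip(signatures[j], recon))
            w[j] *= num / (den + eps)  # multiplicative update keeps w[j] >= 0
    return w


# Hypothetical signatures over 4 length bins, and a 70/30 mixture of them.
signatures = [
    [0.5, 0.3, 0.2, 0.0],
    [0.0, 0.2, 0.3, 0.5],
]
mixture = [0.7 * a + 0.3 * b for a, b in zip(signatures[0], signatures[1])]
w = estimate_weights(signatures, mixture)
```

Because the signatures are held fixed, the fit is a convex problem in the weights, so the same routine can be applied to samples that were not used to train the signatures.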