Cell-free DNA (cfDNA) tests use small amounts of DNA in the bloodstream as biomarkers. While it is thought that cfDNA is largely released by dying cells, the proportion of dying cells’ DNA that reaches the bloodstream is unknown. Here we integrate estimates of cellular turnover rates to calculate the expected amount of cfDNA. By comparing this to the actual amount of cell type-specific cfDNA, we estimate the proportion of DNA reaching plasma as cfDNA. We demonstrate that <10% of the DNA from dying cells is detectable in plasma, and the ratios of measured to expected cfDNA levels vary a thousand-fold among cell types, often reaching well below 0.1%. The analysis suggests that local clearance, presumably via phagocytosis, takes up most of the dying cells’ DNA. Insights into the underlying mechanism may help to understand the physiological significance of cfDNA and improve the sensitivity of liquid biopsies.
This important study makes a bold step towards understanding what fraction of DNA that is liberated from different tissues in a healthy human is found in circulation as cell-free DNA. Unfortunately, the evidence for the conclusions is presently incomplete, but with additional controls, this could become a major achievement for reference in understanding changes in cell-free DNA in disease states.
The human body is in a constant state of cellular turnover, with an estimated 0.3×10¹² cells being replaced daily, two-thirds of which are erythrocytes (Sender and Milo, 2021). Apoptosis is the primary mechanism of cell death in tissues with rapid turnover, in which cells are disposed of in an orderly manner and engulfed by phagocytic cells to recycle their resources.
Apoptosis involves the fragmentation of DNA through two distinct mechanisms. The first mechanism occurs within the apoptotic cell, where endonucleases break down chromatin into nucleosomal units. The second mechanism is carried out by phagocytic cells, which engulf and degrade the DNA of apoptotic cells in order to prevent the release of potentially immunogenic intracellular materials (Nagata, 2005).
It has been known since the late 1940s that cell-free DNA (cfDNA) fragments can be found in the circulation of healthy and diseased individuals (Bendich et al., 1965; Mandel and Metais, 1948). The fragments are typically nucleosome-size (about 165 base pairs), likely representing molecules that were not completely degraded during the process of cell death. With the advent of next-generation sequencing (NGS), cfDNA has become a clinically useful biomarker for various applications. These include non-invasive prenatal testing (detection of fetal chromosomal abnormalities by sampling maternal cfDNA), cancer monitoring (detection of oncogenic mutations in plasma, termed circulating tumor DNA [ctDNA]), and monitoring of allogeneic organ transplants (detection of donor-derived SNPs in cfDNA) (Heitzer et al., 2019). Perhaps the greatest promise of cfDNA is in cancer diagnostics: a blood test that can allow early detection at an actionable stage, real-time assessment of treatment response, detection of recurrence, and identification of specific genetic mutations to inform treatment decisions (Jamshidi et al., 2022; Wan et al., 2017).
Regardless of somatic mutations, the mere presence of cfDNA from a given tissue is of great value as it often correlates with tissue-specific injury (Gala-Lopez et al., 2018; Heitzer et al., 2020; Lehmann-Werman et al., 2018, 2016; Zemmour et al., 2018). Multiple layers of epigenetic information allow the inference of the tissue origins of cfDNA. For example, the size, fragmentation patterns, and exact end position of cfDNA fragments, nucleosome positions reflected in the relative abundance of promoter sequences, and histone modification patterns all allow tracing cfDNA molecules to their tissue origin (Lo et al., 2021; Moss et al., 2018; Oberhofer et al., 2022; Zhou et al., 2022). One particularly sensitive approach is using DNA methylation patterns, a stable determinant of cell identity preserved on cfDNA. Deconvolution of cfDNA methylomes using a reference atlas of human cell type-specific methylomes has revealed various tissues’ relative and absolute contribution to cfDNA in health and disease. Under baseline conditions, over 90% of cfDNA originates in blood cells (neutrophils, megakaryocytes, monocytes, lymphocytes, and erythroblasts), with vascular endothelial cells and hepatocytes being the only solid tissue source (Loyfer et al., 2023). In both homeostatic and pathologic conditions, the exact mechanism by which cfDNA is released is not fully understood but is thought to involve cell death. Whether cfDNA can be released from cells that remain alive after the event is controversial (Stroun et al., 2001). Two striking examples of such a scenario are megakaryocytes and erythroblasts, whose physiological function is to release anuclear cells, namely platelets and erythrocytes (Moss et al., 2022).
From a practical perspective, the low amount of cfDNA (typically ∼1000 genome equivalents per ml of plasma) is a major barrier to the sensitive diagnosis of diseases, particularly cancer, at an early stage. Beyond maximizing the volume of blood drawn and the number of markers tested in parallel, understanding and eventually manipulating the local release and systemic clearance of cfDNA hold great potential for improving the sensitivity of tests. For example, recent studies have suggested pharmacologic approaches for blocking the removal of cfDNA from the systemic circulation, leading to a transient elevation in cfDNA concentration (Tabrizi et al., 2023). The efficiency and determinants of local cfDNA release to circulation have not been examined.
In this study, we use recent estimates of cellular turnover rates (Sender and Milo, 2021) to calculate the expected amount of DNA resulting from cell death from each cell type at a given time. By comparing this to the amount of cell type-specific cfDNA present in the plasma, taking into account estimates of systemic cfDNA clearance rate, we estimate the fraction of DNA that reaches the plasma as cfDNA.
Materials and Methods
Normal cellular turnover data for all cell types except hepatocytes and megakaryocytes were obtained from estimates provided by Sender and Milo (2021). We estimated the cellular turnover of megakaryocytes in two ways. First, using the number of megakaryocytes in the bone marrow (Harrison, 1962; Noetzli et al., 2019) and a maturation time of around five days (Machlus and Italiano, 2013). Second, using the production rate of platelets (Harker and Finch, 1969) and the average number of platelets produced per megakaryocyte (Harker and Finch, 1969; Kaufman et al., 1965; Trowbridge et al., 1984). Cellular turnover of hepatocytes was calculated based on Heinke et al. (2022) by combining estimates for the number of cells and the death rate for the different ploidy groups (see Dataset 1).
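The two megakaryocyte estimates described above reduce to simple ratios, sketched here in Python. The function names and all numeric inputs except the five-day maturation time are hypothetical placeholders chosen for illustration; they are not the values used in the study.

```python
def turnover_from_pool(cell_count, maturation_days):
    """Cells replaced per day if a standing pool turns over once per maturation time."""
    return cell_count / maturation_days


def turnover_from_platelets(platelets_per_day, platelets_per_megakaryocyte):
    """Megakaryocytes consumed per day, inferred from daily platelet output."""
    return platelets_per_day / platelets_per_megakaryocyte


# Placeholder inputs (assumptions for illustration only, not the study's data).
pool_estimate = turnover_from_pool(cell_count=1e9, maturation_days=5.0)
platelet_estimate = turnover_from_platelets(
    platelets_per_day=1e11, platelets_per_megakaryocyte=1e3
)
print(f"pool-based estimate:     {pool_estimate:.2e} cells/day")
print(f"platelet-based estimate: {platelet_estimate:.2e} cells/day")
```

Comparing two independent routes to the same quantity gives a rough check on the megakaryocyte turnover figure.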
Tissue-specific cfDNA concentration
The concentrations of plasma cfDNA derived from specific cell types were obtained from two studies that used deconvolution of the plasma methylome using a human cell type methylation atlas (Loyfer et al., 2023; Moss et al., 2018).
Estimation of the potential DNA flux
We estimated the potential cfDNA plasma levels if all the DNA from the dying cells had reached the bloodstream. Our estimate used the calculated cellular turnover rate together with data on the ploidy of the cells, the volume of blood plasma, and the half-life of cfDNA molecules in the blood. For each cell type $c$, we defined the cellular turnover $d_c$ in units of cells per day and the ploidy (average number of sets of chromosomes) $p_c$. We used a blood plasma volume of $V = 3$ L (ICRP, 2002), the mean lifespan of cfDNA molecules in the blood $\tau$ (Diehl et al., 2008; Lo et al., 1999; To et al., 2003; Yao et al., 2016), and a haploid genome mass of $m_h = 3.2 \cdot 10^{-12}$ g (Piovesan et al., 2019).
The expected level of cfDNA was calculated according to the formula:
$$X_c = \frac{d_c \cdot p_c \cdot \tau}{V}$$
where $X_c$ is given in units of genome equivalents/ml (units of g/ml can be obtained by multiplication by $m_h$).
Intuitively, the potential cfDNA level is obtained by calculating the amount of DNA in dying cells at a given moment (defined by the mean lifespan of cfDNA in plasma) when considering the total volume of the plasma.
Ultimately, we compared the measured cfDNA levels to the potential DNA flux estimates to determine the fraction of DNA that reaches the blood.
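The calculation can be sketched in Python as follows. This is a minimal illustration and not the authors' notebook code. It assumes the expected level is the turnover multiplied by the ploidy and the mean cfDNA lifespan, divided by the plasma volume, as the intuitive description above suggests. The 30-minute half-life, the diploid ploidy, and the measured concentration are illustrative assumptions; the turnover of 2×10¹¹ cells per day is the erythrocyte maturation figure quoted in the text.

```python
import math

PLASMA_VOLUME_ML = 3000.0  # V = 3 L, as stated in the Methods


def mean_lifespan_days(half_life_minutes):
    """Convert a half-life in minutes to a mean lifespan in days (tau = t_half / ln 2)."""
    return half_life_minutes / math.log(2) / (24 * 60)


def expected_cfdna(cells_per_day, ploidy, tau_days, volume_ml=PLASMA_VOLUME_ML):
    """Potential cfDNA level (haploid genome equivalents per ml) if all DNA reached plasma."""
    return cells_per_day * ploidy * tau_days / volume_ml


def fraction_reaching_plasma(measured_ge_per_ml, expected_ge_per_ml):
    """Ratio of measured to potential cfDNA."""
    return measured_ge_per_ml / expected_ge_per_ml


# Illustrative inputs: half-life, ploidy, and measured level are assumptions.
tau = mean_lifespan_days(half_life_minutes=30.0)
potential = expected_cfdna(cells_per_day=2e11, ploidy=2, tau_days=tau)
ratio = fraction_reaching_plasma(measured_ge_per_ml=100.0, expected_ge_per_ml=potential)
print(f"potential level: {potential:.3g} genome equivalents/ml")
print(f"measured / potential: {ratio:.3g}")
```

Multiplying the potential level by the haploid genome mass converts it to g/ml.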
Standard error was collected or calculated for each value used. In several cases, such as the half-life of plasma cfDNA, the value’s uncertainty was large and best described as a multiplicative error factor (the uncertainty of a variable with a lognormal distribution). To enable simple error propagation, we expressed all errors as multiplicative error factors, approximating each linear (normal) error by a lognormal error factor $f_{error}$ computed from μ and σ, the mean and standard error in the linear (normal) model. Thus, we model the uncertainty around the value μ as a random variable $\mu \cdot x$, where $x$ has a lognormal distribution with a shape parameter of $s = \ln(f_{error})$. By definition, the shape parameter is the standard deviation of the log-transformed random variable, $\ln(x)$, which is normally distributed.
Error propagation for the product of two values with multiplicative errors was done analytically using the formula $s_{xy} = \sqrt{s_x^2 + s_y^2}$, where $s_x$ and $s_y$ are the shape parameters of the two values. The formula is based on the fact that the product of two independent lognormal variables is also lognormally distributed, with a shape parameter equal to the square root of the sum of the squares of the original shape parameters.
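A minimal Python sketch of this rule is shown below. It assumes independent factors and the relation s = ln(f) between the shape parameter and the multiplicative error factor; the function name is hypothetical.

```python
import math


def combine_multiplicative_errors(*error_factors):
    """Multiplicative error factor of a product of independent lognormal values.

    Each factor f corresponds to a shape parameter s = ln(f); the shape
    parameters add in quadrature, and the result is converted back to a factor.
    """
    combined_shape = math.sqrt(sum(math.log(f) ** 2 for f in error_factors))
    return math.exp(combined_shape)


# Example: two inputs, each uncertain by a factor of 2.
print(combine_multiplicative_errors(2.0, 2.0))
```

Because the shape parameters add in quadrature, the combined factor for two inputs each uncertain by a factor of 2 is smaller than 4.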
Error propagation for the summation of variables with non-linear uncertainty was calculated using bootstrapping by drawing 1000 samples from the distribution describing the uncertainties of the values.
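The bootstrapping step can be sketched as below, using only the Python standard library. The function name is hypothetical. Summarizing the resampled sums by their geometric mean and a multiplicative factor is an assumption made here for illustration; the source does not specify how the bootstrap distribution was summarized.

```python
import math
import random
import statistics


def bootstrap_sum(values, error_factors, n_samples=1000, seed=0):
    """Propagate multiplicative errors through a sum by resampling.

    Each value is modeled as lognormal with median `value` and shape
    parameter ln(error_factor). Returns the geometric mean of the
    resampled sums and their multiplicative error factor.
    """
    rng = random.Random(seed)
    log_totals = []
    for _ in range(n_samples):
        total = sum(
            rng.lognormvariate(math.log(v), math.log(f))
            for v, f in zip(values, error_factors)
        )
        log_totals.append(math.log(total))
    central = math.exp(statistics.fmean(log_totals))
    factor = math.exp(statistics.stdev(log_totals))
    return central, factor


# Illustrative inputs only (not data from the study).
central, factor = bootstrap_sum([100.0, 200.0], [1.5, 2.0])
print(f"sum ~ {central:.3g}, multiplicative error ~ {factor:.3g}")
```

Resampling is used here because a sum of lognormal variables, unlike a product, has no simple closed-form distribution.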
Based on atlases of human cell type-specific methylation signatures, Moss et al. and Loyfer et al. analyzed the main cell types contributing to plasma cfDNA. They found the primary sources of plasma cfDNA to be blood cells: granulocytes, megakaryocytes, macrophages, and/or monocytes (the signature could not differentiate between the last two), lymphocytes, and erythrocyte progenitors. Other cells that had detectable contributions are endothelial cells and hepatocytes. Qualitatively, these cells represent most of the leading cell types in cellular turnover, as shown by Sender and Milo (2021). Epithelial cells of the gastrointestinal tract, the lung, and the skin are other cell types that significantly contribute to cellular turnover. Dying cells in these tissues are shed into the gut lumen, the air spaces, or out of the skin (note that while DNA from gut and lung epithelial cells can be found in stool and bronchoalveolar lavage, the fate of DNA from skin cells is not known). This arrangement may explain why DNA from these cell types is not represented in plasma cfDNA in healthy conditions. Therefore, it appears that cells with high cfDNA plasma levels are those with relatively high turnover that are not shed out of the body.
We used the cellular turnover estimates of these cell types to calculate the potential amount of DNA discarded from each cell type. We derived the potential levels of cfDNA in the plasma (see Methods). By comparing these data to the measured levels of cfDNA in plasma (Loyfer et al., 2023; Moss et al., 2018), we could calculate the fraction of potential DNA present as cfDNA in the plasma, as illustrated in Figure 1. The results indicate that less than 4% of the DNA of dying cells reaches the plasma. The ratios of measured to expected cfDNA levels vary a thousand-fold, ranging from 1:30 (megakaryocytes and endothelial cells) to 1:3×10⁴ (erythrocyte progenitors).
In general, around 1000 genome equivalents of cfDNA per ml are found in the plasma of healthy individuals. The limit of detection for cfDNA from a specific cell type depends on the assay in use. General assays using deconvolution have a sensitivity of around 1%, i.e., ∼10 genome equivalents per ml (Loyfer et al., 2023). Targeted assays using markers for a specific cell type at deep coverage can improve the sensitivity to around 0.1%, i.e., ∼1 genome equivalent per ml. The gradient in Figure 1 depicts this range of sensitivities. The low ratios of measured to potential cfDNA described for the mentioned cell types indicate that cells with lower cellular turnover, such as skeletal myocytes, adipocytes, and pancreatic beta cells, are not detected in the plasma of healthy individuals because their plasma levels are lower than the sensitivities of existing assays. Notably, a comparison of potential cfDNA plasma levels of breast epithelial cells in healthy women to the limit of detection reveals the breast as an outlier. This might suggest that the local DNA utilization mechanism of dying breast epithelial cells is extremely efficient.
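The detection-limit arithmetic can be made explicit with a short Python sketch. The function names are hypothetical. The total concentration and the two sensitivities are taken from the text; the potential level and plasma fraction in the example are illustrative assumptions.

```python
def detection_limit(total_ge_per_ml, sensitivity_fraction):
    """Smallest cell type-specific cfDNA level an assay can resolve."""
    return total_ge_per_ml * sensitivity_fraction


def is_detectable(potential_ge_per_ml, fraction_reaching_plasma, limit_ge_per_ml):
    """Whether the predicted plasma level of a cell type reaches the assay limit."""
    return potential_ge_per_ml * fraction_reaching_plasma >= limit_ge_per_ml


TOTAL_CFDNA = 1000.0  # genome equivalents per ml, as quoted in the text
deconvolution_limit = detection_limit(TOTAL_CFDNA, 0.01)  # ~1% sensitivity
targeted_limit = detection_limit(TOTAL_CFDNA, 0.001)  # ~0.1% sensitivity

# Hypothetical cell type: potential level and plasma fraction are assumptions.
print(is_detectable(potential_ge_per_ml=50.0,
                    fraction_reaching_plasma=0.01,
                    limit_ge_per_ml=targeted_limit))
```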
In this study, we report a surprising, dramatic discrepancy between the measured levels of cfDNA in the plasma and the potential DNA flux from dying cells. One hypothetical explanation for that discrepancy is the limited sensitivity of typical cfDNA assays to short DNA fragments, which may contribute a significant fraction of the overall cfDNA mass. Regular cfDNA analysis shows a size distribution concentrated around a length of 165 base pairs (bp). The sizes in ctDNA vary more, but most are longer than 100 bp (Alcaide et al., 2020; Udomruk et al., 2021). A recent study suggested a significant fraction of single-strand ultrashort fragments (length of 25-60 bp) (Cheng et al., 2022). However, the total amount of DNA contained in these fragments is less than that of the longer regular cfDNA fragments (Cheng et al., 2022), arguing against ultrashort fragments as an explanation for the “missing” cfDNA material.
An alternative hypothetical explanation is that most DNA of dying cells enters the bloodstream but is rapidly degraded or taken up. Accounting for the tissue-specific DNA concentration found in the blood, we can estimate the half-life of cfDNA in the bloodstream in that case, based on the cellular turnover rate. This calculation suggests that the half-life of cfDNA in the bloodstream should be only a few seconds to a few minutes. However, previous research using various methods (mostly the decay of fetal cfDNA in maternal plasma after birth) has shown that the half-life of DNA in the bloodstream ranges from 15 to 120 minutes, orders of magnitude higher than this estimate suggests (Diehl et al., 2008; Lo et al., 1999; To et al., 2003; Yao et al., 2016). In addition, a systemic clearance mechanism cannot explain the differential representation of cfDNA from different cell types relative to their turnover rate.
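The implied half-life under this alternative scenario can be sketched in Python. The function name is hypothetical. The sketch assumes that all DNA from dying cells enters a 3 L plasma pool and that steady state holds, so the mean lifespan equals the measured amount in plasma divided by the daily input. The numeric inputs are placeholders, not the study's data.

```python
import math


def implied_half_life_minutes(measured_ge_per_ml, cells_per_day, ploidy,
                              volume_ml=3000.0):
    """Half-life cfDNA would need if every dying cell's DNA entered the plasma.

    At steady state, mean lifespan = (amount in plasma) / (input per day);
    half-life = mean lifespan * ln 2.
    """
    tau_days = measured_ge_per_ml * volume_ml / (cells_per_day * ploidy)
    return tau_days * math.log(2) * 24 * 60


# Placeholder inputs for a hypothetical cell type (assumptions, not study data).
t_half = implied_half_life_minutes(
    measured_ge_per_ml=100.0, cells_per_day=1e9, ploidy=2
)
print(f"implied half-life: {t_half:.3g} minutes")
```

The result can be set against the 15 to 120 minute range reported in the literature cited above.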
Therefore, the low fraction of DNA measured as cfDNA suggests that less than a few percent of the DNA from dying cells reaches the bloodstream. An unknown mechanism utilizes the rest with a tissue-specific efficiency. A potential explanation is that tissue-resident phagocytes are degrading the DNA after apoptosis, similar to what has been shown for extruded erythroblast nuclei (Yoshida et al., 2005). Studies in mice have revealed that lysosomal DNases in macrophages play a role in cleaving chromosomal DNA during apoptosis (McIlroy et al., 2000), as well as the degradation of erythroid nuclei in erythroblastic islands (Yoshida et al., 2005). The first DNA fragmentation occurs within the apoptotic cells, resulting in nucleosomal unit multiples of about 180 bp (Matassov et al., 2004). Therefore, up to 200-bp cfDNA fragments in plasma may indicate that phagocytes have not further degraded these fragments.
A comparison between the different types of cells shows a trend in which less DNA flux from cells with higher turnover gets to the bloodstream. In particular, a tiny fraction (1 in 3×10⁴) of DNA from erythroid progenitors arrives at the plasma, indicating an extreme efficiency of the DNA recovery mechanism. Erythroid progenitors are arranged in erythroblastic islands. Up to a few tens of erythroid progenitors surround a single macrophage that collects the nuclei extruded during the erythrocyte maturation process (pyrenocytes) (Chasis and Mohandas, 2008). The amount of DNA discarded through the maturation of over 200 billion erythrocytes per day (Sender and Milo, 2021) exceeds all other sources of homeostatic discarded DNA. Our findings indicate that the organization of dedicated erythroblastic islands functions highly efficiently regarding DNA utilization. The overall trend of higher turnover resulting in a lower cfDNA to DNA flux ratio may indicate similar design principles, in which the utilization of DNA is better in tissues with higher turnover. However, our analysis is limited to only several cell types (due to cfDNA test and deconvolution sensitivities), and extrapolation to cells with lower cell turnover is problematic.
A thorough explanation for the gap between the estimated DNA flux from dying cells and the measured cfDNA data requires more research. Since macrophages play a prominent role in the phagocytosis of dying cells, we hypothesize that the local uptake of cfDNA by activated macrophages is responsible for the uptake of most DNA from dying cells. An interesting implication of this possibility is that cfDNA levels are expected to be highly sensitive to perturbations in the local clearance mechanism. In other words, elevated levels of cfDNA from a given cell type may represent a disruption of local macrophages rather than an actual increase in the rate of cell death.
Comparing the DNA flux involved in the homeostatic cellular turnover of specific cell types to the sensitivity of cfDNA assays reveals some current limitations in the field. Cell types such as adipocytes, cardiomyocytes, and pancreatic beta cells are not represented in the cfDNA of healthy individuals. Our analysis suggests that current assays need to be more sensitive to identify the minute amounts of cfDNA from these cell types. Moreover, the quantitative analysis can predict potential cell types with a non-negligible contribution to the plasma cfDNA and allow their focused study. The current analysis calls for a focus on breast epithelial cells and myocytes, as their potential cfDNA levels are relatively higher than the detection limit. Previous research regarding their contribution to cfDNA has used highly sensitive assays but found no contribution in healthy individuals (Loyfer et al., 2023; Moss et al., 2020, 2018). This might indicate a highly effective mechanism for the utilization of DNA from dying cells, for example, local degradation of myonuclei within the syncytium of skeletal muscle fibers.
Quantitative characterization of the abundance of macrophages concerning the cellular death rate in different tissues could improve our understanding of the DNA clearance mechanism and the role of phagocytes.
Illuminating the discrepancy between the dying cells’ DNA flux and the measured cfDNA levels may open the door for research with clinical potential. The sensitivity of the assays and the amount of available DNA limit the utility of liquid biopsy, particularly in early disease detection. Better characterization of the mechanism which limits available plasma cfDNA could lead to potential interventions that increase the fraction of DNA flux arriving at the plasma and thus improve the sensitivity of liquid biopsies based on cfDNA.
We thank Yinon Bar-On, Lior Greenspon, and Yuval Rosenberg for valuable feedback on this manuscript. Funding: This research was generously supported by the Mary and Tom Beck Canadian Center for Alternative Energy Research, the Schwartz-Reisman Collaborative Science Program, the Ullmann Family Foundation, and the Yotam Project (RM). This research was supported by grants from the Helmsley Charitable Trust, JDRF, NIDDK, and Grail (YD). Prof. Yuval Dor has filed patents on cfDNA analysis. Prof. Ron Milo is the Head of the Mary and Tom Beck Canadian Center for Alternative Energy Research and the Charles and Louise Gartner Professorial Chair incumbent.
Data and code availability
All study data are included in the article and Dataset S1. All code is available in Jupyter notebooks at https://gitlab.com/milo-lab-public/cfdna-and-cellular-turnover
- Evaluating the quantity, quality and size distribution of cell-free DNA by multiplex droplet digital PCR. Sci Rep 10
- Circulating DNA as a possible factor in oncogenesis. Science 148:374–376
- Erythroblastic islands: niches for erythropoiesis. Blood 112:470–478
- Plasma contains ultrashort single-stranded DNA in addition to nucleosomal cell-free DNA. iScience 25
- Circulating mutant DNA to assess tumor dynamics. Nat Med 14:985–990
- Beta Cell Death by Cell-free DNA and Outcome After Clinical Islet Transplantation. Transplantation 102:978–985
- Thrombokinetics in man. J Clin Invest 48:963–974
- The total cellularity of the bone marrow in man. J Clin Pathol 15:254–259
- Diploid hepatocytes drive physiological liver renewal in adult humans. Cell Syst 13:499–507
- Cell-Free DNA and Apoptosis: How Dead Cells Inform About the Living. Trends Mol Med 26:519–528
- Current and future perspectives of liquid biopsies in genomics-driven oncology. Nat Rev Genet 20:71–88
- Basic Anatomical and Physiological Data for Use in Radiological Protection Reference Values (No. 89). ICRP
- Evaluation of cell-free DNA approaches for multi-cancer early detection. Cancer Cell 40:1537–1549
- Circulating megakaryocytes and platelet release in the lung. Blood 26:720–731
- Monitoring liver damage using hepatocyte-specific methylation markers in cell-free circulating DNA. JCI Insight 3. https://doi.org/10.1172/jci.insight.120687
- Identification of tissue-specific cell death using methylation patterns of circulating DNA. Proc Natl Acad Sci U S A 113:E1826–34
- A DNA methylation atlas of normal human cell types. Nature 613:355–364
- Epigenetics, fragmentomics, and topology of cell-free DNA in liquid biopsies. Science 372. https://doi.org/10.1126/science.aaw3616
- Rapid clearance of fetal DNA from maternal plasma. Am J Hum Genet 64:218–224
- The incredible journey: From megakaryocyte development to platelet formation. J Cell Biol 201:785–796
- [Nuclear Acids In Human Blood Plasma]. C R Seances Soc Biol Fil 142:241–243
- Measurement of apoptosis by DNA fragmentation. Methods Mol Biol 282:1–17
- An auxiliary mode of apoptotic DNA fragmentation provided by phagocytes. Genes Dev 14:549–558
- Megakaryocyte and erythroblast DNA in plasma and platelets. bioRxiv. https://doi.org/10.1101/2022.10.03.510502
- Comprehensive human cell-type methylation atlas reveals origins of circulating cell-free DNA in health and disease. Nat Commun 9
- Circulating breast-derived DNA allows universal detection and monitoring of localized breast cancer. Ann Oncol 31:395–403
- DNA degradation in development and programmed cell death. Annu Rev Immunol 23:853–875
- New Insights Into the Differentiation of Megakaryocytes From Hematopoietic Progenitors. Arterioscler Thromb Vasc Biol 39:1288–1300
- Tracing the Origin of Cell-Free DNA Molecules through Tissue-Specific Epigenetic Signatures. Diagnostics (Basel) 12. https://doi.org/10.3390/diagnostics12081834
- On the length, weight and GC content of the human genome. BMC Res Notes 12
- The distribution of cellular turnover in the human body. Nat Med 27:45–48
- About the possible origin and mechanism of circulating DNA apoptosis and active DNA release. Clin Chim Acta 313:139–142
- An intravenous DNA-binding priming agent protects cell-free DNA and improves the sensitivity of liquid biopsies. bioRxiv. https://doi.org/10.1101/2023.01.13.523947
- Rapid clearance of plasma Epstein-Barr virus DNA after surgical treatment of nasopharyngeal carcinoma. Clin Cancer Res 9:3254–3259
- The origin of platelet count and volume. Clin Phys Physiol Meas 5:145–170
- Size distribution of cell-free DNA in oncology. Crit Rev Oncol Hematol 166
- Liquid biopsies come of age: towards implementation of circulating tumour DNA. Nat Rev Cancer 17:223–238
- Evaluation and comparison of in vitro degradation kinetics of DNA in serum, urine and saliva: A qualitative study. Gene 590:142–148
- Phosphatidylserine-dependent engulfment by macrophages of nuclei from erythroid precursor cells. Nature 437:754–758
- Non-invasive detection of human cardiomyocyte death using methylation patterns of circulating DNA. Nat Commun 9
- Epigenetic analysis of cell-free DNA by fragmentomic profiling. Proc Natl Acad Sci U S A 119