Introduction

The human body is in a constant state of cellular turnover, with an estimated 0.3×10¹² cells being replaced daily, two-thirds of which are erythrocytes (Sender and Milo, 2021). Apoptosis is the primary mechanism of cell death in tissues with rapid turnover: cells are disposed of in an orderly manner and engulfed by phagocytic cells, which recycle their resources.

Apoptosis involves the fragmentation of DNA through two distinct mechanisms. The first mechanism occurs within the apoptotic cell, where endonucleases break down chromatin into nucleosomal units. The second mechanism is carried out by phagocytic cells, which engulf and degrade the DNA of apoptotic cells in order to prevent the release of potentially immunogenic intracellular materials (Nagata, 2005).

It has been known since the late 1940s that cell-free DNA (cfDNA) fragments can be found in the circulation of healthy and diseased individuals (Bendich et al., 1965; Mandel and Metais, 1948). The fragments are typically nucleosome-size (165 base pairs), likely representing molecules that were not completely degraded during the process of cell death. With the advent of next-generation sequencing (NGS), cfDNA has become a clinically useful biomarker for various applications. These include non-invasive prenatal testing (detection of fetal chromosomal abnormalities by sampling maternal cfDNA), cancer monitoring (detection of oncogenic mutations in plasma, termed circulating tumor DNA [ctDNA]), and monitoring of allogeneic organ transplants (detection of donor-derived SNPs in cfDNA) (Heitzer et al., 2019). Perhaps the greatest promise of cfDNA is in cancer diagnostics: a blood test that can allow early detection at an actionable stage, real-time assessment of treatment response, detection of recurrence, and identification of specific genetic mutations to inform treatment decisions (Jamshidi et al., 2022; Wan et al., 2017).

Regardless of somatic mutations, the mere presence of cfDNA from a given tissue is of great value as it often correlates with tissue-specific injury (Gala-Lopez et al., 2018; Heitzer et al., 2020; Lehmann-Werman et al., 2018, 2016; Zemmour et al., 2018). Multiple layers of epigenetic information allow the inference of the tissue origins of cfDNA. For example, the size, fragmentation patterns, and exact end position of cfDNA fragments, nucleosome positions reflected in the relative abundance of promoter sequences, and histone modification patterns all allow tracing cfDNA molecules to their tissue origin (Lo et al., 2021; Moss et al., 2018; Oberhofer et al., 2022; Zhou et al., 2022). One particularly sensitive approach is using DNA methylation patterns, a stable determinant of cell identity preserved on cfDNA. Deconvolution of cfDNA methylomes using a reference atlas of human cell type-specific methylomes has revealed the relative and absolute contributions of various tissues to cfDNA in health and disease. Under baseline conditions, over 90% of cfDNA originates in blood cells (neutrophils, megakaryocytes, monocytes, lymphocytes, and erythroblasts), with vascular endothelial cells and hepatocytes being the only solid tissue sources (Loyfer et al., 2023). In both homeostatic and pathologic conditions, the exact mechanism by which cfDNA is released is not fully understood but is thought to involve cell death. Whether cfDNA can be released from cells that remain alive after the event is controversial (Stroun et al., 2001). Two striking examples of such a scenario are megakaryocytes and erythroblasts, whose physiological function is to release anuclear cells, namely platelets and erythrocytes (Moss et al., 2022).

From a practical perspective, the low amount of cfDNA (typically ∼1000 genome equivalents per ml of plasma) is a major barrier to a sensitive diagnosis of diseases, particularly cancer, at an early stage. Beyond maximizing the volume of blood drawn and the number of markers tested in parallel, understanding and eventually manipulating the local release and systemic clearance of cfDNA hold great potential for improving the sensitivity of tests. For example, recent studies have suggested pharmacologic approaches for blocking the removal of cfDNA from the systemic circulation, leading to a transient elevation in cfDNA concentration (Tabrizi et al., 2023). The efficiency and determinants of local cfDNA release to the circulation have not been examined.

In this study, we use recent estimates of cellular turnover rates (Sender and Milo, 2021) to calculate the expected amount of DNA resulting from cell death from each cell type at a given time. By comparing this to the amount of cell type-specific cfDNA present in the plasma, taking into account estimates of systemic cfDNA clearance rate, we estimate the fraction of DNA that reaches the plasma as cfDNA.

Materials and Methods

Cellular turnover

Normal cellular turnover data for all cell types except hepatocytes and megakaryocytes were obtained from estimates provided by Sender and Milo (2021). We estimated the cellular turnover of megakaryocytes in two ways: first, using the number of megakaryocytes in the bone marrow (Harrison, 1962; Noetzli et al., 2019) and a maturation time of around five days (Machlus and Italiano, 2013); second, using the production rate of platelets (Harker and Finch, 1969) and the average number of platelets produced per megakaryocyte (Harker and Finch, 1969; Kaufman et al., 1965; Trowbridge et al., 1984). Cellular turnover of hepatocytes was calculated based on Heinke et al. (2022) by combining estimates for the number of cells and the death rate for the different ploidy groups (see Dataset 1).
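The two megakaryocyte estimates described above reduce to simple ratios, which can be sketched as follows. This is a minimal illustration, not the study's notebook code: the function names are hypothetical, and the numeric inputs are arbitrary placeholders rather than the values taken from the cited sources (only the five-day maturation time appears in the text).

```python
def turnover_from_pool(cell_count, maturation_days):
    """Cells per day, assuming a steady-state pool that is renewed once
    every `maturation_days` (assumption of this sketch)."""
    return cell_count / maturation_days


def turnover_from_platelets(platelets_per_day, platelets_per_megakaryocyte):
    """Cells per day, from the platelet production rate and the average
    platelet yield of a single megakaryocyte."""
    return platelets_per_day / platelets_per_megakaryocyte


# Placeholder inputs, NOT the study's data.
estimate_pool = turnover_from_pool(cell_count=5e8, maturation_days=5.0)
estimate_platelets = turnover_from_platelets(
    platelets_per_day=1e11, platelets_per_megakaryocyte=1000.0
)
print(f"pool-based estimate:     {estimate_pool:.2e} cells/day")
print(f"platelet-based estimate: {estimate_platelets:.2e} cells/day")
```

Comparing the two outputs gives a quick consistency check between independent lines of evidence before either is used downstream.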

Tissue-specific cfDNA concentration

The concentrations of plasma cfDNA derived from specific cell types were obtained from two studies that used deconvolution of the plasma methylome using a human cell type methylation atlas (Loyfer et al., 2023; Moss et al., 2018).

Estimation of the potential DNA flux

We estimated the potential cfDNA plasma levels that would be observed if all the DNA from dying cells reached the bloodstream. The estimate uses the calculated cellular turnover rate, the ploidy of the cells, the volume of blood plasma, and the lifespan of cfDNA molecules in the blood. For each cell type $c$, we defined the cellular turnover $d_c$ in units of cells per day and the ploidy (average number of sets of chromosomes) $p_c$. We used a blood plasma volume of $V$ = 3 L (ICRP, 2002), the mean lifespan $\tau$ of cfDNA molecules in the blood (Diehl et al., 2008; Lo et al., 1999; To et al., 2003; Yao et al., 2016), and a haploid genome mass of $m_h$ = 3.2⋅10⁻¹² g (Piovesan et al., 2019).

The expected level of cfDNA was calculated according to the formula:

$$X_c = \frac{d_c \cdot p_c \cdot \tau}{V}$$

where $X_c$ is given in units of genome equivalents/ml when $\tau$ is expressed in days and $V$ in ml (units of g/ml can be obtained by multiplication by $m_h$).

Intuitively, the potential cfDNA level is obtained by calculating the amount of DNA in dying cells at a given moment (defined by the mean lifespan of cfDNA in plasma) when considering the total volume of the plasma.

Ultimately, we compared the measured cfDNA levels to the potential DNA flux estimates to determine the fraction of DNA that reaches the blood.
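The calculation described in this section can be sketched in a few lines. The sketch assumes the potential level is the product of turnover, ploidy, and mean cfDNA lifespan divided by plasma volume, as the verbal description implies. The function names are hypothetical. Plasma volume and haploid genome mass are from the text, and the turnover is the round 200 billion cells per day mentioned in the Discussion. The one-hour lifespan and the measured concentration are illustrative placeholders, not the study's data.

```python
HAPLOID_GENOME_MASS_G = 3.2e-12   # from the text (Piovesan et al., 2019)
PLASMA_VOLUME_ML = 3000.0         # 3 L, from the text (ICRP, 2002)


def potential_cfdna_ge_per_ml(cells_per_day, ploidy, lifespan_days,
                              plasma_volume_ml=PLASMA_VOLUME_ML):
    """Genome equivalents per ml expected if ALL DNA of dying cells
    reached the plasma: turnover * ploidy * lifespan / volume."""
    return cells_per_day * ploidy * lifespan_days / plasma_volume_ml


def fraction_reaching_plasma(measured_ge_per_ml, potential_ge_per_ml):
    """Ratio of measured to potential cfDNA concentration."""
    return measured_ge_per_ml / potential_ge_per_ml


# Illustrative inputs only: 2e11 cells/day, diploid, 1-hour mean lifespan.
potential = potential_cfdna_ge_per_ml(
    cells_per_day=2e11, ploidy=2, lifespan_days=1.0 / 24.0
)
measured = 200.0  # hypothetical measured level, genome equivalents per ml
fraction = fraction_reaching_plasma(measured, potential)

print(f"potential: {potential:.3e} GE/ml "
      f"({potential * HAPLOID_GENOME_MASS_G:.3e} g/ml)")
print(f"fraction reaching plasma: {fraction:.2e}")
```

Keeping the lifespan in days and the volume in ml makes the result come out directly in genome equivalents per ml, matching the units of the measured concentrations.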

Uncertainties estimate

Standard error was collected or calculated for each value used. In several cases, such as the half-life of plasma cfDNA, the value’s uncertainty was large and best described as a multiplicative error factor (the uncertainty of a variable with a lognormal distribution). To enable simple error propagation, we expressed all errors as multiplicative error factors, approximating each linear (normal) error by a lognormal error with a factor $f_{error}$ derived from μ and σ, the mean and standard error in the linear (normal) model. Thus, we model the uncertainty around the value μ as a random variable μ⋅x, where x has a lognormal distribution with a shape parameter of s = ln($f_{error}$). By definition, the shape parameter is the standard error of the log-transformed random variable ln(x), which is normally distributed.

Error propagation for the product of two values with multiplicative errors $f_x$ and $f_y$ was done analytically using the formula $f_{xy} = \exp\left(\sqrt{\ln(f_x)^2 + \ln(f_y)^2}\right)$. The formula is based on the fact that the product of two independent lognormal variables is also lognormally distributed, with a shape parameter equal to the square root of the sum of the squares of the original shape parameters.
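The rule stated above (shape parameters of independent lognormal factors add in quadrature) can be sketched as a small helper. The function name is hypothetical and the example factors are arbitrary; the sketch generalizes the two-value case to any number of factors, which follows from the same property.

```python
import math


def combine_multiplicative_errors(*error_factors):
    """Multiplicative error of a product of independent lognormal
    quantities: exp(sqrt(sum(ln(f_i)^2)))."""
    sum_of_squares = sum(math.log(f) ** 2 for f in error_factors)
    return math.exp(math.sqrt(sum_of_squares))


# Two quantities, each known to within a factor of 2 (arbitrary example).
combined = combine_multiplicative_errors(2.0, 2.0)
print(f"combined error factor: {combined:.3f}")
```

Because the shape parameters combine like the sides of a right triangle, the combined factor is always smaller than the plain product of the individual factors.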

Error propagation for the summation of variables with non-linear uncertainty was calculated using bootstrapping by drawing 1000 samples from the distribution describing the uncertainties of the values.
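A minimal sketch of such a bootstrap for a sum is shown below. It assumes each term is modelled as its central value multiplied by a lognormal variable with shape parameter equal to the logarithm of its error factor, and it summarizes the result as a geometric mean with a multiplicative error. That summary choice, the function name, and the example inputs are assumptions of this sketch rather than details taken from the study's notebooks.

```python
import math
import random
import statistics


def bootstrap_sum_uncertainty(values, error_factors, n_samples=1000, seed=0):
    """Draw n_samples realizations of sum(value_i * x_i), where each x_i
    is lognormal with mu = 0 and sigma = ln(error_factor_i).
    Returns (geometric mean of the sums, multiplicative error of the sums)."""
    rng = random.Random(seed)
    sigmas = [math.log(f) for f in error_factors]
    log_totals = []
    for _ in range(n_samples):
        total = sum(v * rng.lognormvariate(0.0, s)
                    for v, s in zip(values, sigmas))
        log_totals.append(math.log(total))
    center = math.exp(statistics.fmean(log_totals))
    factor = math.exp(statistics.stdev(log_totals))
    return center, factor


# Arbitrary example: two equal terms, each uncertain by a factor of 2.
center, factor = bootstrap_sum_uncertainty([100.0, 100.0], [2.0, 2.0])
print(f"sum ~ {center:.1f}, multiplicative error ~ {factor:.2f}")
```

Sampling is used here because, unlike a product, a sum of lognormal variables has no closed-form lognormal shape parameter.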

Results

Based on atlases of human cell type-specific methylation signatures, Moss et al. and Loyfer et al. analyzed the main cell types contributing to plasma cfDNA. They found the primary sources of plasma cfDNA to be blood cells: granulocytes, megakaryocytes, macrophages and/or monocytes (the signature could not differentiate between the last two), lymphocytes, and erythrocyte progenitors. Other cells with detectable contributions are endothelial cells and hepatocytes. Qualitatively, these cells represent most of the leading cell types in cellular turnover (Sender and Milo, 2021). Epithelial cells of the gastrointestinal tract, the lung, and the skin also contribute significantly to cellular turnover. Dying cells in these tissues are shed into the gut lumen, the air spaces, or off the skin (note that while DNA from gut and lung epithelial cells can be found in stool and bronchoalveolar lavage, the fate of DNA from skin cells is not known). This arrangement may explain why DNA from these cell types is not represented in plasma cfDNA in healthy conditions. Therefore, it appears that the cell types with high plasma cfDNA levels are those with relatively high turnover whose dying cells are not shed out of the body.

We used the cellular turnover estimates of these cell types to calculate the potential amount of DNA discarded from each cell type, and from it derived the potential levels of cfDNA in the plasma (see Methods). By comparing these values to the measured levels of cfDNA in plasma (Loyfer et al., 2023; Moss et al., 2018), we could calculate the fraction of potential DNA that is present as cfDNA in the plasma, as illustrated in Figure 1. The results indicate that less than 4% of the DNA of dying cells reaches the plasma. The ratios of measured to expected cfDNA levels vary a thousand-fold, ranging from 1:30 (megakaryocytes and endothelial cells) to 1:3×10⁴ (erythrocyte progenitors).

Figure 1. cfDNA as a fraction of homeostatic cell turnover.

Estimates for the DNA flux from homeostatic cell turnover were made based on Sender and Milo (2021) and converted to units of potential cfDNA plasma concentration (see Methods). Cell types are ordered by their estimated cellular turnover. Empty markers represent cell types that are shed out of the body upon turnover. Observed levels of cfDNA (Loyfer et al., 2023; Moss et al., 2018) are shown for cell types found in the plasma, with labels depicting the ratio between potential and measured cfDNA concentrations. Circles, potential representation in cfDNA; diamonds, observed concentration of cfDNA. The assay’s detection limit is presented by a gradient (from around ten genome equivalents for deconvolution assays to around one genome equivalent for targeted assays). Error bars represent the standard error of the mean of the uncertainty, approximated by a lognormal distribution (see Methods). Uncertainty regarding the measured cfDNA is smaller than the marker size.

In general, around 1000 genome equivalents of cfDNA per ml are found in the plasma of healthy individuals. The limit of detection for cfDNA from a specific cell type depends on the assay in use. General assays using deconvolution have a sensitivity of around 1%, i.e., ∼10 genome equivalents (Loyfer et al., 2023). Targeted assays using markers for a specific cell type at deep coverage can improve the sensitivity to around 0.1%, i.e., ∼1 genome equivalent. The gradient in Figure 1 depicts this range of sensitivities. The low ratios of measured to potential cfDNA described for the cell types above indicate that cells with lower cellular turnover, such as skeletal myocytes, adipocytes, and pancreatic beta cells, are not detected in the plasma of healthy individuals because their plasma levels are below the sensitivities of existing assays. Notably, a comparison of the potential cfDNA plasma levels of breast epithelial cells in healthy women to the limit of detection reveals the breast as an outlier. This might suggest that the local DNA utilization mechanism for dying breast epithelial cells is extremely efficient.
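The detection-limit arithmetic in this paragraph can be written out as a short check. The total of about 1000 genome equivalents and the 1% and 0.1% sensitivities are from the text. The function names and the example cell type (its potential level, and the choice of 1:30 as its released fraction) are hypothetical and serve only to exercise the comparison.

```python
TOTAL_CFDNA_GE = 1000.0  # approximate total in healthy plasma, from the text


def detection_limit_ge(sensitivity_fraction, total_ge=TOTAL_CFDNA_GE):
    """Smallest detectable cell type-specific amount, in genome equivalents."""
    return sensitivity_fraction * total_ge


def is_detectable(potential_ge, released_fraction, sensitivity_fraction):
    """True if the expected plasma level (potential * released fraction)
    reaches the assay's detection limit."""
    expected_ge = potential_ge * released_fraction
    return expected_ge >= detection_limit_ge(sensitivity_fraction)


deconvolution_limit = detection_limit_ge(0.01)   # ~10 genome equivalents
targeted_limit = detection_limit_ge(0.001)       # ~1 genome equivalent

# Hypothetical cell type: potential of 100 GE, 1 in 30 reaching the plasma.
seen_by_deconvolution = is_detectable(100.0, 1.0 / 30.0, 0.01)
seen_by_targeted = is_detectable(100.0, 1.0 / 30.0, 0.001)
print(deconvolution_limit, targeted_limit)
print(seen_by_deconvolution, seen_by_targeted)
```

The same comparison, run per cell type, is what separates cell types that should be visible with current assays from those whose homeostatic contribution falls below the limit.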

Discussion

In this study, we report a surprising, dramatic discrepancy between the measured levels of cfDNA in the plasma and the potential DNA flux from dying cells. One hypothetical explanation for that discrepancy is the limited sensitivity of typical cfDNA assays to short DNA fragments, which may contribute a significant fraction of the overall cfDNA mass. Regular cfDNA analysis shows a size distribution concentrated around a length of 165 base pairs (bp). The sizes in ctDNA vary more, but most are longer than 100 bp (Alcaide et al., 2020; Udomruk et al., 2021). A recent study suggested a significant fraction of single-strand ultrashort fragments (length of 25-60 bp) (Cheng et al., 2022). However, the total amount of DNA contained in these fragments is less than that of the longer regular cfDNA fragments (Cheng et al., 2022), arguing against ultrashort fragments as an explanation for the “missing” cfDNA material.

An alternative hypothetical explanation is that most DNA of dying cells enters the bloodstream but is rapidly degraded or taken up. In that case, given the tissue-specific cfDNA concentrations found in the blood and the cellular turnover rates, we can estimate the half-life that cfDNA would need to have in the bloodstream. This calculation suggests a half-life of only a few seconds to a few minutes. However, previous research using various methods (mostly the decay of fetal cfDNA in maternal plasma after birth) has shown that the half-life of DNA in the bloodstream ranges from 15 to 120 minutes, orders of magnitude higher than this estimate suggests (Diehl et al., 2008; Lo et al., 1999; To et al., 2003; Yao et al., 2016). In addition, a systemic clearance mechanism cannot explain the differential representation of cfDNA from different cell types relative to their turnover rate.
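The reverse calculation described above can be sketched by solving the potential-level relation for the lifespan instead of the concentration. The sketch assumes that all DNA of dying cells enters the plasma and that clearance is exponential, so that half-life equals mean lifespan times ln 2. The function names are hypothetical and the inputs are placeholders, not the study's measurements.

```python
import math

PLASMA_VOLUME_ML = 3000.0  # 3 L, from the Methods


def implied_lifespan_days(measured_ge_per_ml, cells_per_day, ploidy,
                          plasma_volume_ml=PLASMA_VOLUME_ML):
    """Mean cfDNA lifespan that would reconcile the measured level with
    100% of the dying cells' DNA entering the plasma."""
    return measured_ge_per_ml * plasma_volume_ml / (cells_per_day * ploidy)


def half_life_from_lifespan(mean_lifespan):
    """Half-life for exponential decay with the given mean lifespan."""
    return mean_lifespan * math.log(2)


# Placeholder inputs: 100 GE/ml measured, 1e9 diploid cells dying per day.
lifespan_days = implied_lifespan_days(100.0, cells_per_day=1e9, ploidy=2)
half_life_seconds = half_life_from_lifespan(lifespan_days) * 24 * 3600
print(f"implied half-life: {half_life_seconds:.1f} s")
```

Comparing the implied value with the 15 to 120 minutes reported in the literature is what rules out rapid systemic clearance as the main explanation.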

Therefore, the low fraction of DNA measured as cfDNA suggests that less than a few percent of the DNA from dying cells reaches the bloodstream. An unknown mechanism utilizes the rest with a tissue-specific efficiency. A potential explanation is that tissue-resident phagocytes are degrading the DNA after apoptosis, similar to what has been shown for extruded erythroblast nuclei (Yoshida et al., 2005). Studies in mice have revealed that lysosomal DNases in macrophages play a role in cleaving chromosomal DNA during apoptosis (McIlroy et al., 2000), as well as the degradation of erythroid nuclei in erythroblastic islands (Yoshida et al., 2005). The first DNA fragmentation occurs within the apoptotic cells, resulting in nucleosomal unit multiples of about 180 bp (Matassov et al., 2004). Therefore, up to 200-bp cfDNA fragments in plasma may indicate that phagocytes have not further degraded these fragments.

A comparison between the different types of cells shows a trend in which a smaller fraction of the DNA flux from cells with higher turnover reaches the bloodstream. In particular, a tiny fraction (1 in 3×10⁴) of DNA from erythroid progenitors arrives at the plasma, indicating an extreme efficiency of the DNA recovery mechanism. Erythroid progenitors are arranged in erythroblastic islands. Up to a few tens of erythroid progenitors surround a single macrophage that collects the nuclei extruded during the erythrocyte maturation process (pyrenocytes) (Chasis and Mohandas, 2008). The amount of DNA discarded through the maturation of over 200 billion erythrocytes per day (Sender and Milo, 2021) exceeds all other sources of homeostatic discarded DNA. Our findings indicate that the organization of dedicated erythroblastic islands functions highly efficiently regarding DNA utilization. The overall trend of higher turnover resulting in a lower cfDNA to DNA flux ratio may indicate similar design principles, in which the utilization of DNA is better in tissues with higher turnover. However, our analysis is limited to only several cell types (due to cfDNA test and deconvolution sensitivities), and extrapolation to cells with lower cell turnover is problematic.

A thorough explanation for the gap between the estimated DNA flux from dying cells and the measured cfDNA data requires more research. Since macrophages play a prominent role in the phagocytosis of dying cells, we hypothesize that the local uptake of cfDNA by activated macrophages is responsible for the uptake of most DNA from dying cells. An interesting implication of this possibility is that cfDNA levels are expected to be highly sensitive to perturbations in the local clearance mechanism. In other words, elevated levels of cfDNA from a given cell type may represent a disruption of local macrophages rather than an actual increase in the rate of cell death.

Comparing the DNA flux involved in the homeostatic cellular turnover of specific cell types to the sensitivity of cfDNA assays reveals some current limitations in the field. Cell types such as adipocytes, cardiomyocytes, and pancreatic beta cells are not represented in the cfDNA of healthy individuals. Our analysis suggests that current assays need to be more sensitive to identify the minute amounts of cfDNA from these cell types. Moreover, the quantitative analysis can predict potential cell types with a non-negligible contribution to the plasma cfDNA and allow their focused study. The current analysis calls for a focus on breast epithelial cells and myocytes, as their potential cfDNA levels are relatively higher than the detection limit. Previous research regarding their contribution to cfDNA has used highly sensitive assays but found no contribution in healthy individuals (Loyfer et al., 2023; Moss et al., 2020, 2018). This might indicate a highly effective mechanism for the utilization of DNA from dying cells, for example, local degradation of myonuclei within the syncytium of skeletal muscle fibers.

Quantitative characterization of the abundance of macrophages concerning the cellular death rate in different tissues could improve our understanding of the DNA clearance mechanism and the role of phagocytes.

Illuminating the discrepancy between the dying cells’ DNA flux and the measured cfDNA levels may open the door for research with clinical potential. The sensitivity of the assays and the amount of available DNA limit the utility of liquid biopsy, particularly in early disease detection. Better characterization of the mechanism which limits available plasma cfDNA could lead to potential interventions that increase the fraction of DNA flux arriving at the plasma and thus improve the sensitivity of liquid biopsies based on cfDNA.

Acknowledgements

We thank Yinon Bar-On, Lior Greenspon, and Yuval Rosenberg for valuable feedback on this manuscript. Funding: This research was generously supported by the Mary and Tom Beck Canadian Center for Alternative Energy Research, the Schwartz-Reisman Collaborative Science Program, the Ullmann Family Foundation, and the Yotam Project (RM). This research was supported by grants from the Helmsley Charitable Trust, JDRF, NIDDK, and Grail (YD). Prof. Yuval Dor has filed patents on cfDNA analysis. Prof. Ron Milo is the Head of the Mary and Tom Beck Canadian Center for Alternative Energy Research and the Charles and Louise Gartner Professorial Chair incumbent.

Data and code availability

All study data are included in the article and Dataset S1. All code is available in Jupyter notebooks at https://gitlab.com/milo-lab-public/cfdna-and-cellular-turnover