Single-cell modeling of routine clinical blood tests reveals transient dynamics of human response to blood loss
Abstract
Low blood count is a fundamental disease state and is often an early sign of illnesses including infection, cancer, and malnutrition, but our understanding of the homeostatic response to blood loss is limited, in part by coarse interpretation of blood measurements. Many common clinical blood tests actually include thousands of single-cell measurements. We present an approach for modeling the unsteady-state population dynamics of the human response to controlled blood loss using these clinical measurements of single-red blood cell (RBC) volume and hemoglobin. We find that the response entails (1) increased production of new RBCs earlier than is currently detectable clinically and (2) a previously unrecognized decreased RBC turnover. Both component responses offset the loss of blood. The model provides a personalized dimensionless ratio that quantifies the balance between increased production and delayed clearance for each individual and may enable earlier detection of both blood loss and the response it elicits.
Introduction
Single-cell measurements and models promise to capture important biological heterogeneity and reveal novel mechanisms (Baron et al., 2018; Giustacchini et al., 2017; Shalek et al., 2013; Tusi et al., 2018). Routine clinical blood tests already include single-cell measurements of cellular, nuclear, and cytoplasmic morphology and some single-cell protein concentrations (Chaudhury et al., 2017; Higgins and Mahadevan, 2010; Kim and Ornstein, 1983; Mohandas et al., 1986). These clinical assays measure fewer states per cell (~1–10) than more recently developed single-cell molecular methods (>1000) (Shalek et al., 2013; Tusi et al., 2018), but these clinical data have three strengths for modeling: (1) the low-dimensional state space is densely sampled, (2) existing mechanistic understanding of single-cell trajectories in this state space can guide specification of dynamic equations, and (3) there is a shorter path to clinical translation of any potential insights. The typical adult human produces about 2 million RBCs per second, with a similar rate of clearance of old RBCs after they have circulated for ~90–120 days. RBC lifespan is tightly controlled within each person but varies from one person to the next (Cohen et al., 2008; Malka et al., 2014). The volume of a typical RBC decreases by about 30% and the hemoglobin mass by about 20% over the course of the RBC’s lifespan, with the average hemoglobin concentration ([Hb]) increasing modestly (Malka et al., 2014; Willekens et al., 2008). Routine complete blood counts (CBCs) can include measurements of single-cell volume (v) and hemoglobin (h) for ~50,000 individual RBCs (Figure 1). Some of the youngest RBCs (‘reticulocytes’ <~3 days old) can be identified in these counts because they generally have RNA remnants in their membranes (d'Onofrio et al., 1995). 
The typical healthy RBC follows a (v,h)-trajectory along the major axis of the (v,h) distribution (u in Figure 1) as it ages until eventually being cleared in the lower left (low u). Static averages of marginal v and h distributions and other bulk blood characteristics are essential components of modern clinical diagnosis: HGB (hemoglobin concentration per unit volume of blood), hematocrit (HCT, volume fraction of RBCs), mean RBC volume (MCV), mean RBC hemoglobin mass (MCH), mean RBC hemoglobin concentration (MCHC), and the coefficient of variation in RBC volume (red cell distribution width or RDW). The ~100,000 single-cell measurements in each routine CBC do not currently inform clinical care directly, but they have great potential to do so. Anemia (low HGB or HCT) (Beutler and Waalen, 2006) is associated with almost all major diseases including cancer, infection, heart failure, autoimmune disease, and malnutrition, and is often the first sign of many of these major illnesses. Understanding the single-cell dynamics of the homeostatic response to blood loss will provide insight into the development and progression of many diseases and enhance our ability to diagnose, monitor, and intervene most effectively.
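As a rough illustration of how the bulk indices listed above relate to the single-cell measurements, the following Python sketch computes MCV, MCH, MCHC, and RDW from paired per-cell volumes and hemoglobin masses. The function name `rbc_indices`, the units (fL and pg), and the choice to take MCHC as the mean of per-cell concentrations are assumptions made for this example; clinical analyzers may derive these indices differently.

```python
from statistics import mean, pstdev


def rbc_indices(volumes_fl, hb_masses_pg):
    """Summarize paired single-RBC volumes (fL) and hemoglobin masses (pg).

    Hypothetical helper for illustration only, not an analyzer algorithm.
    """
    if len(volumes_fl) != len(hb_masses_pg) or not volumes_fl:
        raise ValueError("need equal-length, non-empty inputs")

    mcv = mean(volumes_fl)      # mean RBC volume, fL
    mch = mean(hb_masses_pg)    # mean RBC hemoglobin mass, pg

    # Per-cell concentration: 1 pg/fL = 1e-12 g / 1e-15 L = 1000 g/L = 100 g/dL
    conc_g_dl = [100.0 * h / v for v, h in zip(volumes_fl, hb_masses_pg)]
    mchc = mean(conc_g_dl)      # assumed here: mean of per-cell concentrations

    rdw = 100.0 * pstdev(volumes_fl) / mcv  # coefficient of variation, percent

    return {"MCV_fL": mcv, "MCH_pg": mch, "MCHC_g_dL": mchc, "RDW_percent": rdw}


# Toy input: three cells with identical hemoglobin concentration (made-up values)
idx = rbc_indices([80.0, 90.0, 100.0], [26.4, 29.7, 33.0])
```

With a real CBC, the two input lists would each hold tens of thousands of values, one pair per measured cell.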

Figure 1. Unsteady-state modeling of single-RBC volume and hemoglobin dynamics.
(A) Routine complete blood counts (CBC) measure the single-cell volume and hemoglobin for young RBCs (b(v,h,t), blue contours showing ‘reticulocytes’ that are <~3 days old) as well as for all circulating RBCs (P(v,h,t), red contours showing RBCs of all ages from 0 up to 90–120 days, with RBC lifespan well-controlled in each person but varying from one person to the next). The black line through the origin shows the mean hemoglobin concentration (mean corpuscular hemoglobin concentration, MCHC) for the sampled population and this major axis of the distribution (u) provides a very rough estimate of RBC age, with higher u corresponding to younger age. (B) Schematic of the model of single-RBC volume-hemoglobin dynamics. Individual RBCs are produced as reticulocytes (RET) in the top right and lose about 30% of their volume and about 20% of their hemoglobin during their 90–120 day lifespan, with volume and hemoglobin reductions occurring during an early fast phase parameterized by βv and βh and a later slow phase parameterized by α, with fluctuations in rates of single-RBC volume and hemoglobin change quantified by Dv and Dh. As the single-RBC volume and hemoglobin continue to fall, the probability of clearance increases dramatically as the RBC’s trajectory approaches the boundary region shown as vc. (C) Four measurements were made to establish each subject’s baseline before controlled blood loss. Additional measurements were made 1–3 days and 21 days later. (D) The modeling integrated serial CBCs into the parameter estimation process in a piecewise manner. The first CBC (left) is assumed to be at steady state, and the model is used to estimate dynamic parameters which produce RBC1 given RET1. These model parameters and RET1 are then used to estimate the initial condition leading to timepoint t2, and the model estimates the dynamics between timepoints t1 and t2. 
These steps for timepoint t2 are then repeated to estimate the transient dynamics between each successive timepoint. LS refers to the lifespan of RBCs. Panels (E–F) are frames from Video 1, which shows a simulation of the evolution of P(v,h,t) from t = 0 to t = 105 days for a typical study subject. Equal-probability contours for P(v,h) are shown at the bottom, with the empirical measurement as dashed blue lines and the simulation in solid red. The surface plot also shows the simulated P(v,h,t). The plot of the empirical measurement in dashed blue is serially updated during the movie to the measurement subsequent to the value of t. Marginal P(v,t) and P(h,t) are shown on the left and right.
Results
RBC population dynamics can be approximated with a semi-mechanistic unsteady-state mathematical model of RBC volume and hemoglobin and routine CBCs
A routine CBC samples the two-dimensional single-RBC volume-hemoglobin distribution (P(v,h,t)) in a patient’s circulation at time t (Figure 1). The composition of the circulating RBC population is determined by dynamic processes: production (erythropoiesis) (Bunn, 2013), maturation and aging over a ~100-day lifespan (Willekens et al., 2008), and clearance (Franco, 2009). Master equations are often used to model multi-dimensional probability distributions of single-cell states (Van Kampen, 2007). In the case of RBCs, P(v,h,t) is determined by a time-dependent production term (b(v,h,t)), dynamics, and a clearance term (d(v,h,t)). Each routine CBC with a reticulocyte count provides an estimate of both b(v,h,t) and P(v,h,t). The dynamics of P(v,h) can be modeled as a drift-diffusion process, and the functional specification of the drift, diffusion, and clearance terms can be guided by existing knowledge of in vivo RBC volume and hemoglobin dynamics (Bosman et al., 2008; Franco, 2009; Gifford et al., 2006; Waugh et al., 1992; Willekens et al., 2008). This overall methodology has also been applied recently to many single-cell gene expression data sets (Shalek et al., 2013; Tusi et al., 2018) and has several strengths when applied to these clinical data: (1) (v,h) space is sampled far more densely than gene expression space, (2) b(v,h,t) can be directly sampled with each CBC, (3) rich existing physiologic knowledge of the dynamics of (v,h) can guide the functional form of the model terms (Lew et al., 1995; Waugh et al., 1992; Higgins and Mahadevan, 2010), (4) b(v,h,t) and P(v,h,t) can be repeatedly sampled more frequently (minutes) than the characteristic timescale in the system (~100 day RBC lifespan), and (5) inferred single-cell trajectories can easily be combined with electronic medical record data to understand phenotypic effects of dynamics and feedback.
We investigated RBC population dynamics in a cohort of 28 healthy individuals at baseline and following controlled blood loss. We describe the evolution of P(v,h,t) with the following equation:
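A generic drift-diffusion equation with a source and a sink, consistent with the terms named in this section, can be sketched in LaTeX as follows. The symbol f for the drift vector and the assumption that clearance acts multiplicatively on P are choices made for this sketch; the specific functional forms of f and d are not given here.

```latex
\frac{\partial P}{\partial t}
  = -\nabla \cdot \left( P\,\mathbf{f} \right)
  + \nabla \cdot \left( \mathbf{D}\,\nabla P \right)
  + b(v,h,t)
  - d(v,h)\,P
```

In this form, the first term moves probability density along the typical (v,h) trajectory, the second spreads it according to cell-to-cell variation, and the last two add newly produced cells and remove cleared ones.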
Prior analysis under the assumption of steady state found that the drift term can be approximated as a function of the RBC’s current (v,h) with an early fast phase of volume and hemoglobin reduction during which the hemoglobin concentration ([Hb]) of young RBCs approaches the population mean (Higgins and Mahadevan, 2010). This fast phase is parameterized by βv and βh and is followed by a slower phase of coordinated volume and hemoglobin reduction parameterized by α. (See Figure 1 and details in Materials and methods.) The diffusive term is assumed constant without interaction and encapsulates the variation in the rates of volume and hemoglobin change from one RBC to the next and for the same RBC over time. Based on prior work (Higgins and Mahadevan, 2010; Patel et al., 2015), the clearance term is approximated as a function of the RBC’s current (v,h) and a parameter (vc) for a clearance boundary region (see Figure 1 for a schematic).
The homeostatic response to 10% loss of blood volume includes both an increase in RBC production and a delay in RBC clearance
We studied the effect of blood loss on transient RBC population dynamics by collecting one unit of blood (~10% blood loss) from each subject and estimating model parameters before and after. Significant blood loss triggers a rapid acellular fluid shift to restore intravascular volume that can be detected as a decrease in HCT or HGB. See Figure 2. RBCs are assumed to be lost in a volume- and hemoglobin-independent fashion, meaning that P(v,h,t) is not directly altered (Figure 2A). This assumption is based on prior labeling studies which model the residual lifespan of labeled RBCs (after reinfusion and recollection) to infer that a blood draw is a random sample of RBCs of all ages (Franco, 2009; Franco et al., 2013; Khera et al., 2013; Shrestha et al., 2016). The evidence for this assumption is indirect, relying on models of RBC lifespan distributions, and definitive establishment of its validity awaits the development of an accepted direct measurement or marker of RBC age. An individual can compensate for blood loss by increasing the rate of RBC production or by reducing the rate of clearance, or both. Production and clearance have baseline rates of ~1% per day (Dornhorst, 1951; Franco et al., 2013). Under physiologic conditions, only the oldest RBCs are cleared (Cohen et al., 2008; Franco, 2009; Franco et al., 2013; Khera et al., 2013). The gold standard 'reticulocyte count' does not reliably detect increased production for about 5 days (Jelkmann and Lundby, 2011; Piva et al., 2015; Sieff, 2017) (Figure 2A), but the true production rate may increase earlier, and even less is known about any modulation of RBC clearance (Higgins and Mahadevan, 2010; Malka et al., 2014; Patel et al., 2015).

Figure 2. RBC dynamics are more sensitive to blood loss than RBC population statistics.
(A) Complete blood count (CBC) statistics for 28 healthy subjects before (0), 1–3 days after blood loss (+1), and 21 days after blood loss (+21). Intensive quantities (HGB, concentration of hemoglobin per unit volume of blood; HCT, volume fraction of RBCs in the blood) change significantly immediately following blood loss due to fluid shift, but single-RBC population statistics do not change significantly. MCV, mean RBC volume; RDW, coefficient of variation in RBC volume; MCHC, mean RBC hemoglobin concentration; rFraction, percentage of identified reticulocytes. See Figure 2—figure supplement 1 for rMCV, mean reticulocyte volume; rRDW, coefficient of variation in reticulocyte volume; MCH, mean RBC hemoglobin mass; CHDW, coefficient of variation in single-RBC hemoglobin concentration. By 21 days after blood loss, the CHDW and rFraction have increased significantly relative to baseline. MCHC at 21 days has decreased relative to 1–3 days. See main text and supplementary information for more detail. (B) Single-RBC volume and hemoglobin dynamics show significant change soon after blood loss. α and Dv increase significantly, and vc drops. (p1 compares 0 with +1, p2 compares +1 with +21, p3 compares 0 with +21.) Boxplots show the median (middle horizontal line), the 25th and 75th percentiles, and whiskers extend to data points not more than 1.5 times the interquartile range from the median. Notches show a 95% confidence interval for the median, and any additional outliers are shown as discrete points.
Figure 2—source data 1. Source data for boxplots in Figure 2.
https://cdn.elifesciences.org/articles/48590/elife-48590-fig2-data1-v1.xlsx
Over the first 1–3 days following blood loss, the single-cell (v,h) dynamics for most subjects showed significant increases in model parameters α and Dv and a decrease in vc (Figure 2B). Greater α reflects a faster reduction in (v,h) for the typical RBC or a longer RBC lifespan, since α is normalized by a nominal lifespan, or both. Greater Dv reflects increased variation in the rate of RBC volume reduction, or a longer RBC lifespan, or both. Smaller vc reflects delayed clearance of RBCs with (v,h) low enough to have been cleared prior to blood loss.
Model simulation identifies regions of P(v,h) where the blood loss response causes the largest changes (Figure 3A): increase in the low-u region containing older cells, milder increase in the high-u, low-[Hb] region containing young RBCs, and a balancing decrease along the u axis above the low tail. We can quantify the empirical effect of blood loss response on the older cell fraction by integrating P(u) one standard deviation below the median and lower. Figure 4D shows a significant increase in the fraction of older RBCs for most subjects during the first 1–3 days after blood loss, consistent with a delayed clearance.
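The older-cell fraction described above can be sketched in Python as follows. The definition of u used here (projection of mean-normalized volume and hemoglobin onto the diagonal) and the function names are assumptions for illustration; the text does not specify the exact projection, or whether the median and standard deviation come from the baseline sample or the current one, so the sketch accepts an optional reference sample.

```python
from math import sqrt
from statistics import mean, median, pstdev


def project_u(volumes, hb_masses):
    """Assumed definition: project mean-normalized (v, h) onto the (1, 1) diagonal."""
    v_bar, h_bar = mean(volumes), mean(hb_masses)
    return [(v / v_bar + h / h_bar) / sqrt(2) for v, h in zip(volumes, hb_masses)]


def low_u_fraction(u_values, reference=None):
    """Fraction of cells with u at or below (median - 1 SD) of the reference sample.

    If no reference is given, the cutoff is computed from u_values itself.
    """
    ref = u_values if reference is None else reference
    cutoff = median(ref) - pstdev(ref)
    return sum(u <= cutoff for u in u_values) / len(u_values)


# Toy u values (made up): two of the ten fall below the cutoff
toy_fraction = low_u_fraction(list(range(1, 11)))
```

Comparing this fraction before and after blood loss, with the baseline sample passed as `reference`, is one way to look for enrichment of the low-u tail.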

Figure 3. Model simulations show that blood loss causes a shift of probability density from the central axis of the (v,h) distribution, mostly to the low volume-low hemoglobin tail.
Comparison of the absolute (A) and relative (B) changes in the simulated single-RBC volume-hemoglobin probability density when setting Dv’=4Dv, α’=2α, and vc’=0.9vc, to match the median changes shown in Figure 2B. (C) Arrows depict the typical movement in probability density 1–3 days after blood loss. (D–F) show the effects of isolated changes to individual parameters, with changes to α and vc corresponding to retention of older RBCs (delayed clearance), and changes to Dv adding density in the high-volume, low-hemoglobin region where new RBCs appear, corresponding, in part, to increased production.

Figure 4. Single-cell model provides a mechanistic link between dynamics of the (v,h) distribution and the balance between increased RBC production and delayed RBC clearance in response to blood loss.
(A) Schematic of the single-cell volume-hemoglobin distribution for RBCs. The major axis of the distribution (u) corresponds to the mean single-RBC hemoglobin concentration (MCHC). An RBC’s position when projected onto u corresponds roughly to its age, with younger RBCs generally appearing in the upper right, and aging along the u axis toward the origin in the bottom left. We can compare changes in the fraction of older RBCs by integrating density along u as shown in the inset in the top left. We can compare changes in the fraction of newly produced RBCs by conditioning on higher u and integrating density along the [Hb] axis as shown in the inset in the top right of panel (A). (B) The top panel shows a typical (v,h) distribution that has been transformed onto the u-[Hb] plane in the bottom panel. (C) The typical blood loss response after 1–3 days includes an increase in the fraction of newly produced cells which will have [Hb] more than one standard deviation below the median and u more than one standard deviation above the median (p<1e-3), corresponding to the top right inset in panel (A) and consistent with increased production. (D) 1–3 days following blood loss, the typical response also involves an increase in the fraction of older RBCs, located more than one standard deviation (15%) below the median u (p<1e-3), corresponding to the top left inset in panel (A) and consistent with a delayed clearance. (E) The mean RBC age (MRBC), as estimated by the glycated hemoglobin fraction, has decreased on average by about 4% after 21 days, but there is significant variation, with some subjects seeing an increase in MRBC. (F) The model characterizes the relative balance between increased production and delayed clearance in each subject’s blood response by a dimensionless parameter ratio.
The time-weighted average of this ratio after blood loss for each subject is significantly correlated with the estimated change in MRBC (ρ = −0.59), suggesting that the model of (v,h) dynamics has accurately captured the balance of the typical subject’s blood loss response. The red line is a least-squares linear fit. (G) The dimensionless parameter ratio distinguishes subjects whose MRBC becomes shorter (production-dominated) during response to blood loss from those whose MRBC becomes longer (clearance-dominated). (See Figure 2 caption for boxplot description.)
Figure 4—source data 1. Source data for boxplots in Figure 4.
https://cdn.elifesciences.org/articles/48590/elife-48590-fig4-data1-v1.xlsx
Newly produced RBCs have higher volume and lower hemoglobin concentration (d'Onofrio et al., 1995) and appear in the upper right of the (v,h) plane, or the bottom right quadrant of the u-[Hb] plane (Figure 4A,B). Figure 3 shows that a simulated increase in Dv is associated with an increase in P(v,h) in this region. We can look for empirical evidence of increased production by conditioning on u being more than one standard deviation above the median and then integrating the marginal [Hb] distribution falling at least one standard deviation (~5%) below the median. Figure 4C shows a significant increase for the typical subject, consistent with RBC production increasing days earlier than can be detected by the current gold standard reticulocyte count (Figure 2A). We did not find any statistically significant sex-specific differences.
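The conditioning step described above can be sketched in Python along the following lines. The function name is hypothetical, and because the text does not say whether the reported quantity is the joint fraction of all cells or the fraction within the high-u subset, the sketch returns both.

```python
from statistics import median, pstdev


def young_cell_fractions(u_values, hb_conc):
    """Return (joint, conditional) fractions of high-u, low-[Hb] cells.

    joint:       share of all cells with u > median + 1 SD and
                 [Hb] < median - 1 SD
    conditional: share of the high-u cells that also have low [Hb]
    """
    u_cut = median(u_values) + pstdev(u_values)
    c_cut = median(hb_conc) - pstdev(hb_conc)

    # Keep the [Hb] values of cells in the high-u region only
    high_u_conc = [c for u, c in zip(u_values, hb_conc) if u > u_cut]
    hits = sum(c < c_cut for c in high_u_conc)

    joint = hits / len(u_values)
    conditional = hits / len(high_u_conc) if high_u_conc else 0.0
    return joint, conditional


# Toy data (made up): ten cells, one of which is high-u with low [Hb]
toy_u = list(range(1, 11))
toy_conc = [33.0] * 9 + [20.0]
joint, conditional = young_cell_fractions(toy_u, toy_conc)
```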
MCHC rise and subsequent fall is consistent with a combination of delayed clearance and increased production
Single-RBC hemoglobin concentration ([Hb]) increases during the first few weeks of an RBC’s lifespan and is then stable (Franco et al., 2013). Clearance delay would therefore enrich the fraction of older RBCs which have [Hb] slightly higher than the population mean, and the population mean [Hb] (MCHC) would increase. On the other hand, increased production in isolation would reduce MCHC by adding more young RBCs with lower [Hb]. For the typical subject, we find that MCHC increases shortly after blood loss and then falls, dropping below the baseline level by 21 days (Figure 5). Both delayed clearance and increased production would be expected to increase the coefficient of variation in [Hb], (cellular hemoglobin distribution width or CHDW) by enriching for RBCs with extreme [Hb], also consistent with measurements of CHDW (Figure 5), which increases after blood loss and remains elevated relative to baseline even 21 days later.

Figure 5. Following blood loss, the MCHC rise and fall and the sustained CHDW rise are consistent with a combination of delayed RBC clearance and increased RBC production.
(Top) The intra-subject MCHC tends to increase immediately after blood loss (left, p<0.05) and then decreases below baseline by 21 days (right, p<0.01). (Bottom) The intra-subject CHDW increases immediately after blood loss (p<0.002) and then increases again by 21 days (p<0.002).
Figure 5—source data 1. Source data for boxplots in Figure 5.
https://cdn.elifesciences.org/articles/48590/elife-48590-fig5-data1-v1.xlsx
The model enables estimation of the relative magnitudes of the production increase and clearance delay for individual subjects
The model thus suggests that the response to blood loss includes both delayed clearance (modeled as a higher α and lower vc) and increased production (modeled as a higher Dv). These two component responses will have opposite effects on the mean RBC age (MRBC), with increased production enriching for younger RBCs and shortening MRBC, and delayed clearance enriching for older RBCs and lengthening MRBC. MRBC can be estimated in these nondiabetic subjects by measuring the glycated hemoglobin fraction (Dornhorst, 1951; Franco, 2009; Khera et al., 2013; Malka et al., 2016; Bunn et al., 1976; Cohen et al., 2008; Dijkstra et al., 2017). Figure 4 shows that this estimated MRBC has decreased by about 4% for the typical subject by 21 days, consistent with relatively more increased production than delayed clearance for the typical subject, but the balance varies across subjects.
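The opposing effects on MRBC can be illustrated with a deliberately simple daily-cohort calculation in Python. This is a toy sketch, not the model used in the study: it assumes a uniform age distribution at baseline, a fixed 100-day lifespan, and arbitrary illustrative values for the production increase and the clearance delay. Age-independent blood loss is omitted because it does not change the age distribution.

```python
def mean_age_after(days, lifespan=100, production_factor=1.0, clearance_delay=0):
    """Mean RBC age (days) after `days` of a modified production/clearance regime.

    cohorts[a] holds the relative number of cells of age `a` days.
    All parameter values are illustrative assumptions.
    """
    # Baseline steady state: one unit of cells in each daily age cohort
    cohorts = {age: 1.0 for age in range(lifespan)}

    for _ in range(days):
        # Every cohort ages by one day
        cohorts = {age + 1: n for age, n in cohorts.items()}
        # Cohorts are cleared once they reach the (possibly delayed) lifespan
        cohorts = {age: n for age, n in cohorts.items()
                   if age < lifespan + clearance_delay}
        # Newly produced cells enter at age 0
        cohorts[0] = production_factor

    total = sum(cohorts.values())
    return sum(age * n for age, n in cohorts.items()) / total


baseline = mean_age_after(21)                                # unchanged regime
more_production = mean_age_after(21, production_factor=2.0)  # doubled production
delayed_clearance = mean_age_after(21, clearance_delay=21)   # no clearance for 21 days
```

In this toy setting the mean age falls when production is raised and rises when clearance is delayed, which is the qualitative behavior the paragraph above describes.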
The model can be used to estimate the response ratio for each subject as a dimensionless number. A higher ratio corresponds to greater production increase and would be expected to shorten MRBC, while a lower ratio corresponds to greater clearance delay and would lengthen MRBC. We validate the model by comparing this ratio to the change in MRBC estimated from independent measurements of HbA1c and find (Figure 4F) the expected negative correlation (p < 0.002). Subjects whose modeled blood loss response shows transient (v,h) dynamics with relatively higher production increase have a greater reduction in MRBC (Figure 4G).
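A correlation check of this kind can be sketched in Python as follows. The sketch computes a Pearson coefficient as an assumption, since the text does not state which coefficient was used, and the numbers are invented toy values rather than study data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented toy values: per-subject response ratio and percent change in MRBC
toy_ratio = [0.5, 1.0, 1.5, 2.0, 2.5]
toy_delta_mrbc = [3.0, 1.0, -2.0, -3.0, -6.0]
r = pearson_r(toy_ratio, toy_delta_mrbc)
```

A negative value of `r` corresponds to the expected pattern, in which subjects with a higher ratio show a larger reduction in MRBC.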
Perturbations to single-RBC volume and hemoglobin distributions persist for at least 21 days after loss
The model thus finds that volume and hemoglobin dynamics of the typical RBC are significantly altered shortly after blood loss and remain altered for at least 21 days. Because P(v,h,t) is determined by these dynamics, our results imply that it should be possible to distinguish 21-day post-blood loss CBCs from pre-blood loss CBCs based only on P(v,h), without having to consider measurements of cell count or concentration like HGB, HCT, or reticulocyte count. We used machine learning methods to classify measurements of P(v,h) and achieved cross-validated accuracy > 98% (AUC 0.98) with multiple methods (quadratic discriminants, complex trees, etc.). This classification by P(v,h) was significantly more accurate than classification using only the currently standard count-based markers (HCT and reticulocyte count, accuracy 93%, AUC 0.90).
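For readers less familiar with the two performance measures quoted above, the following Python sketch shows how accuracy and AUC can be computed from classifier scores. It does not reproduce the classifiers used in the study; the function names, the threshold, and the scores are hypothetical toy values.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case scores higher
    than a randomly chosen negative case, with ties counted as one half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def accuracy(scores_pos, scores_neg, threshold):
    """Fraction of cases classified correctly at a fixed score threshold."""
    correct = sum(s >= threshold for s in scores_pos)
    correct += sum(s < threshold for s in scores_neg)
    return correct / (len(scores_pos) + len(scores_neg))


# Hypothetical scores: post-blood-loss CBCs (positive) vs. baseline CBCs (negative)
post_scores = [0.9, 0.8, 0.4]
pre_scores = [0.5, 0.3, 0.2]
toy_auc = auc(post_scores, pre_scores)
toy_acc = accuracy(post_scores, pre_scores, threshold=0.45)
```

AUC summarizes ranking quality across all thresholds, whereas accuracy depends on the particular threshold chosen, which is why the two numbers are reported together.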
Discussion
This single-cell model of routinely available clinical data provides a mechanistic link between the (v,h) distribution and changes in the RBC age distribution. The model identifies delayed RBC clearance as an important unrecognized component of the compensatory response to blood loss, and it enables more nuanced and precise inferences about the homeostatic response to a fundamental pathologic process in different individuals.
Our analysis begins with a mechanistic model and leads to identification of empirical changes in the (v,h) distribution that are associated with the response to blood loss. A non-mechanistic approach comparing arbitrary distribution statistics before and after blood loss may also be fruitful, but given the large number of potential statistics on distributions of tens of thousands of measurements and the small number of cases (n = 28), statistical significance of the identified associations would likely be limited. More importantly, the advantage of a mechanistic modeling approach either in addition to or instead of a purely statistical or machine learning approach is that it provides a hypothesized physiologic context. Additional falsifiable predictions may then be deduced to provide further validation opportunities, as shown for instance in Figure 5. A mechanistic model also enables assessment of counterfactuals, which is particularly important in the clinical context, where patient factors or pre-existing conditions not present in discovery or development cohorts might significantly compromise accuracy when inference methods are applied to real-world populations. An understanding of the mechanistic basis for an inference method or algorithm will increase the likelihood that these problematic situations can be anticipated and perhaps avoided. In the context of this study, such conditions may include transfusion, sickle cell disease, or mechanical RBC stresses altering RBC volume associated with disseminated intravascular coagulation, microangiopathic hemolytic anemia, and other related pathologic processes.
The model has potential for immediate clinical decision support by detecting increased RBC production earlier than the current gold standard reticulocyte count in our study cohort. Further study is needed to compare the transient (v,h) dynamics in patients with active disease processes and to investigate which factors control the production/clearance ratio of a subject’s blood loss response. As more single-cell methods mature, modeling of higher-dimensional cell states will enable richer understanding of physiologic homeostasis and adaptation and help realize the vision for precision medicine.
Materials and methods
Human subjects
All 28 subjects (18 male, 10 female) enrolled in the study were healthy and athletically active individuals aged 18 to 40 on the day of enrollment. The study size provided at least four same-sex biological replicates and allowed for the possibility of a 50% dropout during the study. Subjects were excluded from enrollment if they participated in competitive sporting events during the study procedures, or if they were a member of a registered anti-doping testing pool for any international sporting federations, national anti-doping organizations, or professional sporting organizations. Prior to study commencement, all participants provided written, informed consent. Approval for study procedures was granted by the University of Utah Institutional Review Board (IRB Protocol #00083533) and for analysis of human subject data by the Partners Healthcare Institutional Review Board. An outline of the study design and collection time points is shown in Figure 1.
Blood collection
Prior to each blood collection, subjects were seated with their feet on the floor for a minimum of ten minutes per World Anti-Doping Agency blood collection guidelines (https://www.wada-ama.org/en/resources/world-anti-doping-program/guidelines-blood-sample-collection). After the ten-minute equilibration period, blood was collected via venipuncture of an antecubital vein into one 6 mL serum-separator tube and one 6 mL K2EDTA (BD Vacutainer) tube. After collection, whole blood samples were immediately refrigerated until analysis. Additional aliquots were stored at −80 °C for HbA1c measurement. Following three baseline collections over the course of 2–4 weeks, each subject in the study donated one unit of blood (~475 mL) according to Associated Regional and University Pathologists (ARUP) standard operating procedures.
CBC measurements
Whole blood samples collected in K2EDTA tubes were measured for a Complete Blood Count plus reticulocyte% using a Siemens Advia 2120i. Briefly, samples were brought from refrigerated to room temperature while on a nutating mixer for at least 15 min prior to analysis. All samples were measured in duplicate. All samples were collected in Salt Lake City, Utah, at either the Sports Medicine Research and Testing Laboratory (SMRTL) or the University of Utah Hospital. The approximate altitude at these locations is 1400 m above sea level. All subjects in the study were residents at this altitude and are assumed to be adapted to the environment.
Model details
We measured CBCs roughly every other day for a week for all subjects and used the model to infer each subject’s baseline RBC population dynamics between these 4 timepoints (e.g., t = 1, 3, 5, and 7 days). At t = 1, b(v,h,1) is measured and used to estimate source terms extending back in time by a number of days equivalent to the RBC lifespan (LS). The RBC age distribution is assumed to be uniform with a nominal LS (Cohen et al., 2008). The first CBC provides a sample of P(v,h,1), and Equation 1 can be used to estimate p1, the parameters characterizing the RBC population dynamics at baseline (Higgins and Mahadevan, 2010; Patel et al., 2015). The transient dynamics between t = 1 and t = 3 can be estimated using p1 and Equation 1. Initial conditions at t = 1 are determined by integrating Equation 1 for LS – 2 days with a source term equal to b(v,h,1). The CBC measured on day 3 (t = 3) provides a direct estimate of b(v,h,3) and a sample of P(v,h,3). Equation 1 is then used to estimate p3, the parameters characterizing the transient dynamics between t = 1 and t = 3. This process is repeated for each successive CBC to provide quantification of the transient dynamics as shown in Figure 2. See Video 1 for additional detail.
The video shows a simulation of the evolution of P(v,h,t) from t = 0 to t = 105 days for a typical study subject.
Equal-probability contours for P(v,h) are shown at the bottom, with the empirical measurement as blue dashed lines, and the simulation in solid red. The surface plot also shows the simulated P(v,h,t). The plot of the empirical measurement in dashed blue is serially updated during the movie to the measurement subsequent to the value of t. Marginal distributions, P(v,t), and P(h,t), are shown at the sides along with empirical measurements in blue.
In the Fokker-Planck equation describing the RBC maturation dynamics (Equation 1), the drift term is expressed as a combination of an initial fast phase, followed by a slow phase. In Equation 2, v and h are normalized by their sample population means, and both approach 1 as the fast phase transitions to the slow phase:
In Equations 1 and 2, P refers to the volume-hemoglobin probability distribution of the RBC population, D is the diffusion matrix (with Dv and Dh quantifying fluctuations in the rates of volume and hemoglobin change), and α, βv, and βh parameterize the drift processes. The birth term b(v,h,t) is estimated by reticulocyte count measurements at time t along with the RBC population, with b(v,h) defined by the volume and hemoglobin distribution measured for reticulocytes identified using standard validated clinical laboratory techniques (d'Onofrio et al., 1995). The clearance term, d(v,h), is defined as follows:
Here, the population means of v and h are the MCV and MCH, respectively, and vc parameterizes the clearance boundary region (Higgins and Mahadevan, 2010; Patel et al., 2015).
Data availability
All data we are authorized to share according to our Institutional Review Board approved study protocols for patient data collection and subsequent analysis and publication is included in the manuscript. We are able to share de-identified study subject data, and we have provided source data files for all appropriate figures as tables in spreadsheets.
References
- Bunn et al. (1976) The biosynthesis of human hemoglobin A1c. Slow glycosylation of hemoglobin in vivo. Journal of Clinical Investigation 57:1652–1659. https://doi.org/10.1172/JCI108436
- Bunn (2013) Erythropoietin. Cold Spring Harbor Perspectives in Medicine 3:a011619. https://doi.org/10.1101/cshperspect.a011619
- Franco (2009) The measurement and importance of red cell survival. American Journal of Hematology 84:109–114. https://doi.org/10.1002/ajh.21298
- Franco et al. (2013) Changes in the properties of normal human red blood cells during in vivo aging. American Journal of Hematology 88:44–51. https://doi.org/10.1002/ajh.23344
- Malka et al. (2014) In vivo volume and hemoglobin dynamics of human red blood cells. PLOS Computational Biology 10:e1003839. https://doi.org/10.1371/journal.pcbi.1003839
- Malka et al. (2016) Mechanistic modeling of hemoglobin glycation and red blood cell kinetics enables personalized diabetes monitoring. Science Translational Medicine 8:359ra130. https://doi.org/10.1126/scitranslmed.aaf9304
- Patel et al. (2015) Modulation of red blood cell population dynamics is a fundamental homeostatic response to disease. American Journal of Hematology 90:422–428. https://doi.org/10.1002/ajh.23982
- Piva et al. (2015) Clinical utility of reticulocyte parameters. Clinics in Laboratory Medicine 35:133–163. https://doi.org/10.1016/j.cll.2014.10.004
- Shrestha et al. (2016) Models for the red blood cell lifespan. Journal of Pharmacokinetics and Pharmacodynamics 43:259–274. https://doi.org/10.1007/s10928-016-9470-4
- Sieff (2017) Regulation of erythropoiesis. uptodate.com.
- Van Kampen (2007) Stochastic Processes in Physics and Chemistry. North-Holland Personal Library, Amsterdam, Elsevier.
- Willekens et al. (2008) Erythrocyte vesiculation: a self-protective mechanism? British Journal of Haematology 141:549–556. https://doi.org/10.1111/j.1365-2141.2008.07055.x
Decision letter
- Naama Barkai, Senior and Reviewing Editor; Weizmann Institute of Science, Israel
- Ziad Obermeyer, Reviewer
- Steven Spitalnik, Reviewer; Columbia University, United States
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Acceptance summary:
The paper uses experimentally induced blood loss to study red cell dynamics, with the innovation of using single-cell measurements before and after blood loss to parameterize a model of red cell birth, dynamics, and clearance. This allows the authors to draw conclusions about production of red blood cells (detectable earlier than current methods) and previously unsuspected decreases in clearance. The paper is a good example of how the kinds of rich data routinely collected by clinical labs, but often ignored in clinical practice, can be made sense of in the service of both clinical care – here, early detection of blood loss – and physiological understanding of biological processes.
Decision letter after peer review:
Thank you for submitting your article "Single-Cell Modeling of Routine Clinical Blood Tests Reveals Transient Dynamics of Human Response to Blood Loss" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by Naama Barkai as the Senior and Reviewing Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Ziad Obermeyer (Reviewer #1); Steven Spitalnik (Reviewer #2).
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
As you can see, both reviewers appreciated the importance of your analysis and findings, and supported publication. Provided that the concerns of reviewer #1 can be satisfactorily addressed, the paper can be accepted.
Reviewer #1:
This paper uses experimentally induced blood loss to study red cell dynamics. The innovation is to use single-cell measurements from this setup, before and after blood loss, to estimate parameters of a model of red cell birth, dynamics, and clearance. This allows the authors to draw conclusions about production of RBCs (detectable earlier than current methods) and previously unsuspected decreases in clearance.
Overall I found the paper to be a compelling early example of how the kinds of rich data routinely collected by clinical labs, but generally ignored in clinical practice, can be made sense of in the service of both clinical care (e.g. early detection of blood loss) and physiological understanding.
I had a few questions about the assumptions – to the extent these concerns are unfounded, I would have appreciated more explanation of them in the Materials and methods.
- The authors state that blood loss is essentially a random draw of RBCs (i.e. that the distribution P is unchanged before and after). Is this assumed or known? I can imagine several scenarios where different kinds of cells could be more or less likely to be in capillaries vs. larger vessels accessed by venipuncture. I can also imagine that such non-random changes induced directly by drawing blood would pose problems for estimating model parameters as the authors do, and I would have appreciated reassurance or some discussion of this.
- The birth function is estimated by measuring reticulocytes, but I could not tell whether reticulocytes were inferred from the size distribution (i.e. by assuming they are drawn from some distribution of high (v,h)) or measured directly. If the former, how sensitive are measurements to the assumptions used to define reticulocytes?
- What altitude were the blood draws taken at? And how does the sample frame how we should interpret results? I ask because patients seem to be athletes in Utah – things might be different in Park City vs. Boston.
I have a broader point on framing. One of the compelling practical aspects of the paper is the idea that the model parameters can be used as a new way to infer early blood loss.
- The authors set up the current CBC parameters as the straw man, and note that their methods perform better in a variety of ways. Fair enough. But these former measures are almost laughably simple – two means and two variances (even the second variance is, if I'm not mistaken, not commonly reported). A better comparison would be some more sophisticated measures of the marginal v and h distributions, and especially measures of covariance.
- It's possible that a larger set of X's (right hand side variables) such as these would do just as well as the model derived parameters in predicting whether a measurement was taken before or after blood loss, particularly if fed into a good machine learning model – after all, the model is picking up on some empirical shifts in distribution and (given infinite data) it would be impossible for the structural model to do better at predicting something than a good prediction model itself. If this is not true, all the better – I can easily imagine that in small samples the model does much better than a kitchen sink + ML approach, but this is in itself worth showing.
- Regardless, one of the primary benefits of having a structural model of the physiology (as opposed to a bunch of measurements + ML) seems to be to perform counterfactual simulations. While this is not my area, I can imagine a number of interesting questions – how would the dynamics change under a range of different conditions: different volumes or chronicity of blood loss (e.g. from colon cancer rather than a unit of drawn blood), etc. One could also specifically model the kinds of changes that would be detected by single cell measures but specifically not by standard measures.
Finally, as a style point, I was a bit overwhelmed by all the figures. Many of the subfigures were not even discussed in the text, which may be a sign that they belong as supplementary Figures. I found the video quite informative (though would have liked the marginals projected as well) and wonder if putting a few frames from this as a figure would help give intuitions about what is actually happening empirically. Overall, refocusing on the main innovations of the paper and cutting some unnecessary material would be helpful to the reader.
Reviewer #2:
This is the latest in a series of interesting and provocative studies from Dr. Higgins and his colleagues. They have identified novel ways of "mining" data from routine CBCs to provide additional clinical insights and identify underlying mechanisms and/or opportunities for further research. This manuscript similarly succeeds in these regards.
In particular, by studying otherwise healthy volunteers, they identify that some individuals respond to an acute blood loss by, predominantly, rapidly producing new RBCs, whereas others respond by, predominantly, slowing down clearance of existing, circulating RBCs. To my knowledge, these are new and very interesting findings, particularly the latter. What distinguishes these individuals in their predominant response characteristics? Genetics? Diet? Environmental influences? Other things? This will provide a rich opportunity for future studies.
In addition, it will be interesting, in the future, to investigate how various patient populations, with various underlying disorders, respond to acute blood loss, whether that blood loss is pathological (e.g., a GI bleed or trauma) or iatrogenic (e.g., during and following surgery). Unravelling the underlying mechanisms will be important in expanding our knowledge of basic pathophysiology and may also affect how physicians respond therapeutically.
Finally, although one can provide plausible underlying mechanisms, based on prior work, regarding how humans respond to acute blood loss by increasing RBC production, it is harder to conceive of how clearance of aging RBCs is regulated in this setting. How does the mononuclear phagocyte system "recognize" acute blood loss and then down-regulate clearance accordingly? This interesting conundrum opens an important new area for investigation.
https://doi.org/10.7554/eLife.48590.sa1
Author response
Reviewer #1:
[…] I had a few questions about the assumptions – to the extent these concerns are unfounded, I would have appreciated more explanation of them in the Materials and methods.
The reviewer raises valid questions about assumptions and our approach that we address in our revised manuscript as suggested, with details below.
- The authors state that blood loss is essentially a random draw of RBCs (i.e. that the distribution P is unchanged before and after). Is this assumed or known? I can imagine several scenarios where different kinds of cells could be more or less likely to be in capillaries vs. larger vessels accessed by venipuncture. I can also imagine that such non-random changes induced directly by drawing blood would pose problems for estimating model parameters as the authors do, and I would have appreciated reassurance or some discussion of this.
We have clarified in our revised Results section that this assumption is based on prior labeling studies which model the residual lifespan distribution of labeled RBCs (after reinfusion and recollection) to infer that a blood draw is a “random sample” consisting of “a mixture of RBCs of all ages” (Shrestha et al., 2016). We make two further notes. First, the evidence for this assumption is indirect, relying on models of RBC lifespan distributions, because there is no accepted direct measurement or marker of RBC age, and some uncertainty regarding this assumption is warranted, as we now state in our revised manuscript. Second, we agree with the reviewer that factors such as vascular geometry likely do have differential effects on RBCs with different properties, such as volume, and to the extent that these different properties are associated with age, blood draws from these different locations would be expected to provide incompletely random samples of the age distribution. We expect that the magnitude of these differences is less than the analytic precision of the current volume and hemoglobin distribution measurements. We are not aware of any evidence that blood draws from different anatomic sites yield significantly different RBC volume or hemoglobin distributions (though factors like the postural state of the patient do affect HCT and HGB (Leppänen and Gräsbeck, 2009)), but this question has not been systematically investigated as far as we know.
- The birth function is estimated by measuring reticulocytes, but I could not tell whether reticulocytes were inferred from the size distribution (i.e. by assuming they are drawn from some distribution of high (v,h)) or measured directly. If the former, how sensitive are measurements to the assumptions used to define reticulocytes?
We have revised our description of the birth function in our Materials and methods section to make clearer that we are defining it using the standard validated reticulocyte measurement (d’Onofrio et al., 1995), which provides the distribution of volume and hemoglobin masses of reticulocytes.
- What altitude were the blood draws taken at? And how does the sample frame how we should interpret results? I ask because patients seem to be athletes in Utah – things might be different in Park City vs. Boston.
As we now state in our revised Materials and methods, all samples were collected in Salt Lake City, Utah, at approximately 1400 m above sea level, and because all study subjects were residents of the area and were adapted to this altitude, we do not expect altitude to have a significant effect on our results.
I have a broader point on framing. One of the compelling practical aspects of the paper is the idea that the model parameters can be used as a new way to infer early blood loss.
- The authors set up the current CBC parameters as the straw man, and note that their methods perform better in a variety of ways. Fair enough. But these former measures are almost laughably simple – two means and two variances (even the second variance is, if I'm not mistaken, not commonly reported). A better comparison would be some more sophisticated measures of the marginal v and h distributions, and especially measures of covariance.
- It's possible that a larger set of X's (right hand side variables) such as these would do just as well as the model derived parameters in predicting whether a measurement was taken before or after blood loss, particularly if fed into a good machine learning model – after all, the model is picking up on some empirical shifts in distribution and (given infinite data) it would be impossible for the structural model to do better at predicting something than a good prediction model itself. If this is not true, all the better – I can easily imagine that in small samples the model does much better than a kitchen sink + ML approach, but this is in itself worth showing.
The reviewer raises an important and timely topic. As we now state in our revised Discussion, while our analysis begins with a mechanistic model and leads to identification of empirical changes in the (v,h) distribution that are associated with the response to blood loss, a brute force machine learning approach comparing arbitrary distribution statistics before and after blood loss would also be fruitful. The statistics to which the model leads us are associated with blood loss and would be identified by the machine learning approach as the reviewer notes, as would a very large number of other correlated statistics. The challenge as the reviewer notes is that in this case there is a vastly larger number of potential statistics on distributions of tens of thousands of measurements, and coupled with the small number of cases (n = 28), statistical significance of identified associations would be more difficult if not impossible to establish, if mechanistic insights were not used to bias the prioritization of statistics to investigate.
- Regardless, one of the primary benefits of having a structural model of the physiology (as opposed to a bunch of measurements + ML) seems to be to perform counterfactual simulations. While this is not my area, I can imagine a number of interesting questions – how would the dynamics change under a range of different conditions: different volumes or chronicity of blood loss (e.g. from colon cancer rather than a unit of drawn blood), etc. One could also specifically model the kinds of changes that would be detected by single cell measures but specifically not by standard measures.
As we note in our revised Discussion, we strongly agree with the reviewer that the advantage of the mechanistic modeling approach either in addition to or instead of machine learning is that it provides a hypothesized physiologic context enabling further validation by testing many additional falsifiable predictions derived by logical extension, some of which are mentioned by the reviewer, as well as counterfactuals. In the clinical context, we agree with the reviewer that the ability to assess counterfactuals is particularly important, as is also highlighted by comments from reviewer #2 below: if we understand the mechanistic basis for a clinical predictor, then we will be in a better position to identify underlying disorders that may make the predictor misleading (e.g., transfusion, sickle cell disease, microangiopathic hemolytic anemia, xerocytosis, some medications, etc.).
Finally, as a style point, I was a bit overwhelmed by all the figures. Many of the subfigures were not even discussed in the text, which may be a sign that they belong as supplementary figures. I found the video quite informative (though would have liked the marginals projected as well) and wonder if putting a few frames from this as a figure would help give intuitions about what is actually happening empirically. Overall, refocusing on the main innovations of the paper and cutting some unnecessary material would be helpful to the reader.
We appreciate the good suggestions and have moved the unreferenced figure panels into supplementary figures. We have also added the marginal distributions to the video and have added frames to Figure 1.
Reviewer #2:
This is the latest in a series of interesting and provocative studies from Dr. Higgins and his colleagues. They have identified novel ways of "mining" data from routine CBCs to provide additional clinical insights and identify underlying mechanisms and/or opportunities for further research. This manuscript similarly succeeds in these regards.
In particular, by studying otherwise healthy volunteers, they identify that some individuals respond to an acute blood loss by, predominantly, rapidly producing new RBCs, whereas others respond by, predominantly, slowing down clearance of existing, circulating RBCs. To my knowledge, these are new and very interesting findings, particularly the latter. What distinguishes these individuals in their predominant response characteristics? Genetics? Diet? Environmental influences? Other things? This will provide a rich opportunity for future studies.
We are very pleased that the reviewer finds our study interesting and provocative. We agree that it will be very interesting to investigate the factors that determine the different individual level responses, and we anticipate that discovery of relevant factors will further elucidate fundamental mechanisms of RBC pathophysiology. Based in part on some of the reviewer’s prior work, we suspect that small changes in iron levels, even within the reference interval, may be relevant, and that is a high-priority area for future investigation but will require a larger study or a design more focused on that hypothesis.
In addition, it will be interesting, in the future, to investigate how various patient populations, with various underlying disorders, respond to acute blood loss, whether that blood loss is pathological (e.g., a GI bleed or trauma) or iatrogenic (e.g., during and following surgery). Unravelling the underlying mechanisms will be important in expanding our knowledge of basic pathophysiology and may also affect how physicians respond therapeutically.
Finally, although one can provide plausible underlying mechanisms, based on prior work, regarding how humans respond to acute blood loss by increasing RBC production, it is harder to conceive of how clearance of aging RBCs is regulated in this setting. How does the mononuclear phagocyte system "recognize" acute blood loss and then down-regulate clearance accordingly? This interesting conundrum opens an important new area for investigation.
We agree that evidence for mechanisms modulating RBC clearance is lacking. We speculate that signals enhancing RBC production, such as erythropoietin, might play a dual role and potentially increase the likelihood that a mononuclear phagocyte would down-regulate its clearance activity, but any evidence supporting this hypothesis is currently absent.
https://doi.org/10.7554/eLife.48590.sa2
Article and author information
Author details
Funding
National Institutes of Health (1DP2DK098087)
- John M Higgins
Partnership for Clean Competition
- Daniel Eichner
- John M Higgins
Life Sciences Research Foundation (Good Ventures Fellowship)
- Anwesha Chaudhury
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
This study was funded by the NIH, the Life Sciences Research Foundation (LSRF), and the Partnership for Clean Competition. None of the funders played any role in the decision to submit for publication. The authors appreciate expert advice on HbA1c testing from Dr. Randie Little and the technical assistance of the Diabetes Diagnostic Laboratory at the University of Missouri Medical School. The authors acknowledge instrument and testing support from Siemens Healthcare Diagnostics and Sebia Diagnostics. AC is a Good Ventures Fellow of the Life Sciences Research Foundation. All simulations were run on the Harvard Medical School O2 cluster. The authors would like to thank Jonathan Carlson, Bronner Goncalves, Michael Dworkin, Erica Normandin, Charles Pedlar, and Rebecca Ward for helpful discussions.
Ethics
Human subjects: Approval for study procedures was granted by the University of Utah Institutional Review Board (IRB Protocol #00083533) and for analysis of human subject data by the Partners Healthcare Institutional Review Board.
Senior and Reviewing Editor
- Naama Barkai, Weizmann Institute of Science, Israel
Reviewers
- Ziad Obermeyer
- Steven Spitalnik, Columbia University, United States
Publication history
- Received: May 20, 2019
- Accepted: November 8, 2019
- Version of Record published: December 17, 2019 (version 1)
Copyright
© 2019, Chaudhury et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- Page views: 698
- Downloads: 93
- Citations: 9
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.