Derivation and external validation of clinical prediction rules identifying children at risk of linear growth faltering
Abstract
Background:
Nearly 150 million children under-5 years of age were stunted in 2020. We aimed to develop a clinical prediction rule (CPR) to identify children likely to experience additional stunting following acute diarrhea, to enable targeted approaches to prevent this irreversible outcome.
Methods:
We used clinical and demographic data from the Global Enteric Multicenter Study (GEMS) to build predictive models of linear growth faltering (decrease of ≥0.5 or ≥1.0 in height-for-age z-score [HAZ] at 60-day follow-up) in children ≤59 months presenting with moderate-to-severe diarrhea, and community controls, in Africa and Asia. We screened variables using random forests, and assessed predictive performance with random forest regression and logistic regression using fivefold cross-validation. We used the Etiology, Risk Factors, and Interactions of Enteric Infections and Malnutrition and the Consequences for Child Health and Development (MAL-ED) study to (1) re-derive, and (2) externally validate our GEMS-derived CPR.
Results:
Of 7639 children in GEMS, 1744 (22.8%) experienced severe growth faltering (≥0.5 decrease in HAZ). In MAL-ED, we analyzed 5683 diarrhea episodes from 1322 children, of which 961 (16.9%) episodes experienced severe growth faltering. Top predictors of growth faltering in GEMS were: age, HAZ at enrollment, respiratory rate, temperature, and number of people living in the household. The maximum area under the curve (AUC) was 0.75 (95% confidence interval [CI]: 0.75, 0.75) with 20 predictors, while 2 predictors yielded an AUC of 0.71 (95% CI: 0.71, 0.72). Results were similar in the MAL-ED re-derivation. A 2-variable CPR derived from children 0–23 months in GEMS had an AUC = 0.63 (95% CI: 0.62, 0.65), and AUC = 0.68 (95% CI: 0.63, 0.74) when externally validated in MAL-ED.
Conclusions:
Our findings indicate that use of prediction rules could help identify children at risk of poor outcomes after an episode of diarrheal illness. They may also be generalizable to all children, regardless of diarrhea status.
Funding:
This work was supported by the National Institutes of Health under Ruth L. Kirschstein National Research Service Award NIH T32AI055434 and by the National Institute of Allergy and Infectious Diseases (R01AI135114).
Editor's evaluation
This work would be valuable to global health scientists, particularly in low- and middle-income countries where childhood stunting is an ongoing challenge, and to statisticians interested in building clinical prediction rules. The authors' solid methodology leveraged large, rich datasets from multi-center studies to build and validate predictive models.
https://doi.org/10.7554/eLife.78491.sa0
Introduction
Despite recent advances in the prevention and treatment of childhood malnutrition, nearly 150 million children under-5 years of age were stunted in 2020 (UNICEF et al., 2021). Stunting is defined as a length- or height-for-age z-score (HAZ) 2 or more standard deviations below the population median (de Onis et al., 2013), and is considered both an indicator of underlying deficits (i.e., chronic malnutrition, The World Bank, 2021), as well as a potential contributor to future health problems (e.g., through poor immune system maturation, Rytter et al., 2014; Bourke et al., 2016). Furthermore, stunting has been consistently associated with increased risk of morbidity and mortality, delayed or deficient cognitive development, and reduced educational attainment (McDonald et al., 2013; Black et al., 2013; Olofin et al., 2013; Adair et al., 2013; de Onis and Branca, 2016; Black et al., 2008; Bhaskaram, 2002). Timely and accurate identification of children most likely to experience stunting offers an opportunity to prevent such negative health outcomes.
Stunting has been linked with diarrheal diseases across many settings (Checkley et al., 2008). An estimated 10.9% of global stunting is attributable to diarrhea (Danaei et al., 2016), and a child with diarrhea is more likely to have a lower HAZ or to die than age-matched controls (Kotloff et al., 2013). Given the 1.1 billion episodes of childhood diarrhea that occur globally every year (Collaborators GBDDD, 2018), assessment of children seeking healthcare for diarrhea treatment provides an opportunity to identify those at increased risk for negative outcomes, including stunting and death. Once identified, these children could be specifically targeted for intensive interventions, thereby more efficiently allocating public health resources.
In this study, we aimed to develop parsimonious, easy-to-implement clinical prediction rules (CPRs) to identify children under 5 years of age most likely to experience linear growth faltering among community-dwelling children presenting to care for acute diarrhea. CPRs are algorithms that aid clinicians in interpreting clinical findings and making clinical decisions (Reilly and Evans, 2006). Linear growth faltering, or falling below standardized height/length growth trajectory projections, captures children whose growth has slowed precipitously and is a precursor of stunting. A number of prior studies have identified risk factors for linear growth faltering (Danaei et al., 2016; Prado et al., 2019; Sofiatin et al., 2019; Naylor et al., 2015; Zhang et al., 2017; Richter et al., 2018; Schott et al., 2013; Richard et al., 2019; Rogawski et al., 2017; Rogawski et al., 2018), but many of these were single-site studies using traditional model building approaches, some of which lacked appropriate assessments of model discrimination and calibration. Building on this body of literature, we used machine learning methods on data from two large multicenter studies to derive and externally validate prognostic prediction models for growth faltering, with the hope of reliably identifying children who would most benefit from additional nutritional intervention after care for acute diarrhea.
Methods
Study population for derivation cohort 1 (GEMS)
We used data from the Global Enteric Multicenter Study (GEMS) to derive CPRs for growth faltering. GEMS has been described in depth elsewhere (Kotloff et al., 2013; Kotloff et al., 2012). Briefly, GEMS was a prospective case–control study of acute moderate-to-severe diarrhea (MSD) in children 0–59 months of age. Data were collected from December 2007 to March 2011 at seven sites in Africa and Asia: Mali, The Gambia, Kenya, Mozambique, Bangladesh, India, and Pakistan. MSD was defined as diarrhea accompanied by one or more of the following: dysentery, dehydration, or hospital admission. Diarrhea was defined as new onset (after ≥7 days diarrhea-free) of three or more looser than normal stools in the previous 24 hr lasting 7 days or less. Cases were enrolled at initial presentation to a sentinel hospital or health center, and matched within 14 days to one to three controls without diarrhea enrolled from the community. Demographic, epidemiological, and clinical information was collected from caregivers of both cases and controls via standardized questionnaires, and clinic staff conducted physical examinations and collected stool samples, which underwent conventional and molecular testing to ascertain the pathogen that caused the diarrhea. Approximately 60 days (up to 91) after enrollment, fieldworkers visited the homes of both cases and controls to collect standardized clinical and epidemiological information and repeat anthropometry.
Children were excluded if follow-up observations occurred <49 or >91 days after enrollment, or if HAZ measurements were implausible (Brander et al., 2019), defined as: (1) HAZ >6 or HAZ <−6; (2) change in HAZ >3; (3) >1.5 cm loss of height from enrollment to follow-up; (4) growth of >8 or >4 cm at 49- to 60-day follow-up for children ≤6 and >6 months old, respectively; (5) growth >10 or >6 cm at 61- to 91-day follow-up for children ≤6 and >6 months old, respectively.
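The exclusion rules above can be written as a single predicate. The following is a minimal Python sketch, not the authors' code (their analysis was done in R). The `Visit` record layout and the function name are hypothetical. Two readings of the text are assumptions: criterion (1) is applied to HAZ at both visits, and criterion (2) is treated as an absolute change.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """Enrollment and follow-up anthropometry for one child (hypothetical layout)."""
    age_months: float        # age at enrollment
    haz_enroll: float        # HAZ at enrollment
    haz_follow: float        # HAZ at follow-up
    height_enroll_cm: float
    height_follow_cm: float
    days_to_follow: int      # days from enrollment to follow-up


def is_excluded(v: Visit, max_follow_days: int = 91) -> bool:
    """Return True if the record fails any of the stated exclusion criteria."""
    # Follow-up outside the allowed window (49 to max_follow_days, inclusive)
    if v.days_to_follow < 49 or v.days_to_follow > max_follow_days:
        return True
    # (1) HAZ > 6 or < -6 (assumption: checked at both visits)
    if any(abs(h) > 6 for h in (v.haz_enroll, v.haz_follow)):
        return True
    # (2) change in HAZ > 3 (assumption: absolute change)
    if abs(v.haz_follow - v.haz_enroll) > 3:
        return True
    growth = v.height_follow_cm - v.height_enroll_cm
    # (3) more than 1.5 cm loss of height
    if growth < -1.5:
        return True
    # (4)/(5) implausibly fast growth; the limit depends on age and follow-up timing
    young = v.age_months <= 6
    if v.days_to_follow <= 60:
        limit = 8 if young else 4
    else:
        limit = 10 if young else 6
    return growth > limit
```

Passing `max_follow_days=95` gives the wider follow-up window described for MAL-ED.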
Parents or caregivers of participants provided informed consent, either in writing or witnessed if parents or caregivers were illiterate. The GEMS study protocol was approved by ethical review boards at each field site and the University of Maryland, Baltimore, USA. This analysis utilized publicly available data, see Data Availability statement, and as such is non-human subjects research.
Study population for derivation and validation cohort 2 (MAL-ED)
We used the Etiology, Risk Factors, and Interactions of Enteric Infections and Malnutrition and the Consequences for Child Health and Development (MAL-ED) study to (1) re-derive the best full model, and (2) externally validate a 2-variable parsimonious version of our GEMS-derived CPR for growth faltering. MAL-ED is a longitudinal birth cohort study, and study details have been described elsewhere (MAL-ED Network Investigators, 2014; Platts-Mills et al., 2014; Platts-Mills et al., 2015; Richard et al., 2014). In brief, healthy children were enrolled within 17 days of birth and followed prospectively through 24 months of age. Children were enrolled from October 2009 to March 2012 from eight countries in Asia, Africa, and South America, including Tanzania, South Africa, Pakistan, India, Nepal, Bangladesh, Peru, and Brazil. Information on household, demographic, and clinical data from mother and child were collected at enrollment and reassessed periodically, and illness and feeding information was collected at twice-weekly household visits.
In MAL-ED, diarrhea was defined as maternal report of three or more loose stools in a 24-hr period, or one loose stool with blood. Diarrhea episodes had to be separated by at least 2 diarrhea-free days to qualify as distinct episodes. To match MAL-ED longitudinal cohort active surveillance data to GEMS, in which children were enrolled upon presentation to clinic with acute diarrhea, we linked anthropometric measurements and other predictor variables with diarrhea episodes in MAL-ED using the following methods (https://github.com/LeungLab/CPRgrowthfaltering): First, each episode of diarrhea was linked to the closest HAZ measurement from before the onset of diarrhea symptoms, but no more than 31 days beforehand. Each diarrhea episode was also linked with the HAZ measurement closest to 75 days after the onset of diarrhea symptoms, but within 49 and 95 days inclusive. Second, each diarrhea episode was linked to the closest observation of each potential predictor variable. Each dietary intake variable had to be observed within 90 days of the diarrhea episode, and each household descriptor variable had to be observed within 6 months of the onset of diarrhea in order to be eligible; otherwise, those predictors were considered missing for that specific diarrhea episode. Finally, data were split into age categories, and only one diarrhea episode per enrolled child per model was randomly selected without replacement for analysis.
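The HAZ linkage step can be sketched as follows. This is an illustrative Python version, not the code in the authors' repository; `link_haz` and the `(study day, HAZ)` tuple layout are hypothetical. It assumes a measurement taken on the onset day itself does not count as "before" onset, and the default follow-up window uses the 95-day upper limit stated for MAL-ED in the inclusion criteria and outcome definition.

```python
from typing import Optional, Sequence, Tuple

Measurement = Tuple[int, float]  # (study day, HAZ); hypothetical layout


def link_haz(onset_day: int,
             measurements: Sequence[Measurement],
             max_days_before: int = 31,
             target_after: int = 75,
             window_after: Tuple[int, int] = (49, 95),
             ) -> Tuple[Optional[Measurement], Optional[Measurement]]:
    """Pick the baseline and follow-up HAZ measurements for one diarrhea episode."""
    # Baseline: closest measurement strictly before onset, at most 31 days earlier
    before = [m for m in measurements
              if 0 < onset_day - m[0] <= max_days_before]
    baseline = max(before, key=lambda m: m[0]) if before else None

    # Follow-up: measurement closest to 75 days after onset, within the window
    lo, hi = window_after
    after = [m for m in measurements if lo <= m[0] - onset_day <= hi]
    follow = (min(after, key=lambda m: abs(m[0] - onset_day - target_after))
              if after else None)
    return baseline, follow


# Example: roughly monthly anthropometry and an episode starting on study day 140
visits = [(100, -1.0), (130, -1.1), (160, -1.3), (190, -1.6), (220, -1.7)]
baseline, follow = link_haz(140, visits)
```

An episode with no eligible measurement on either side returns `None` for that side, which corresponds to the episode being dropped for missing HAZ.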
The same inclusion/exclusion criteria were applied as listed above for the GEMS growth faltering analysis, with the exception that the allowed follow-up period extended up to and including 95 days.
Parents or caregivers of participants provided informed consent. The MAL-ED study protocol was approved by ethical review boards at each field site and the Johns Hopkins Institutional Review Board, Baltimore, USA. This analysis utilized publicly available data, see Data Availability statement, and as such is non-human subjects research.
Outcomes
We defined growth faltering as a decrease in HAZ of ≥0.5 HAZ within 49–91 days of enrollment in GEMS, or within 49–95 days in MAL-ED.
Predictive variables
In GEMS, potential predictors included over 130 descriptors of the child, household, and community, collected at enrollment (Supplementary file 1). Collinear or conceptually similar predictors were removed from consideration to maximize model utility (e.g., HAZ, but not mid-upper arm circumference [MUAC], was considered in the main model). We considered individual components of household wealth, but did not explore the composite wealth variable used in other reports (Brander et al., 2019) since its utilization in a CPR would require collecting multiple parameters that were already being considered individually.
In MAL-ED, we considered 60 potential predictors of growth faltering (Supplementary file 1). We limited possible predictor variables to those that would be easily assessable upon presentation to clinic in a low-resource setting (i.e., did not consider characteristics that required diagnostic testing), and again only considered individual components of combination indicators (e.g., wealth index, Vesikari score).
Statistical analysis
In our complete-case analysis, we screened variables using variable importance measures from random forests to identify the most predictive variables. Random forests are an ensemble learning method whereby multiple decision trees (1000 throughout this analysis) are built on bootstrapped samples of the data with only a random sample of potential predictors considered at each split, thereby decorrelating the trees and reducing variability (James et al., 2013). Throughout this analysis, the number of variables considered at each split was equal to the square root of the total number of potential variables, rounded down. Variables were ranked by predictive importance based on the reduction in mean squared prediction error achieved by including the variable in the predictive model on out-of-bag samples (i.e., observations not in the bootstrapped sample).
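Two mechanics from this paragraph, the number of candidate variables per split and the bootstrap/out-of-bag partition, can be illustrated with a short Python sketch. It uses only the standard library and does not reproduce the 'ranger' implementation the authors used. The function names are hypothetical, and the figure of 130 predictors is illustrative, since the text says "over 130".

```python
import math
import random


def mtry(n_predictors: int) -> int:
    """Candidate variables per split: square root of the total, rounded down."""
    return math.isqrt(n_predictors)


def bootstrap_split(n_rows: int, rng: random.Random):
    """Draw one bootstrap sample; return (in-bag indices, out-of-bag indices)."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]  # sample with replacement
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n_rows) if i not in chosen]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the sketch is reproducible
in_bag, oob = bootstrap_split(1000, rng)

# Variables offered to a single split (130 predictors is an illustrative count)
candidates = rng.sample(range(130), mtry(130))
```

With 130 predictors, `mtry(130)` is 11; with the 60 MAL-ED predictors it is 7. Roughly a third of rows fall out of bag for any one tree, and these are the rows used to score variable importance.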
Generalizable performance was assessed using fivefold repeated cross-validation. In each of 100 iterations, random forests were fit to a training dataset (random 80% sample of analytic dataset), and variables were ranked using the random forest importance measure as above. Separate logistic regression and random forest regression models were then fit to a subset of the top predictive variables in the training dataset. Subsets examined were the top 1–10, 15, 20, 30, 40, and 50 predictors. Each of these models was then used to predict the outcome (growth faltering) on the test dataset. Model performance was assessed using the receiver operating characteristic (ROC) curves and the cross-validated C-statistic (area under the ROC curve, AUC). The AUC describes how well a model can discriminate between the two levels of a binary outcome in the test data from the cross-validated folds.
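For readers less familiar with the C-statistic, the sketch below computes an AUC directly from its definition: the probability that a randomly chosen child with the outcome receives a higher predicted score than a randomly chosen child without it, with ties counting one half. It is a plain Python illustration with hypothetical function names, not the 'cvAUC' or 'pROC' routines used in the analysis, and the pairwise loop is chosen for clarity over speed.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def train_test_indices(n_rows: int, rng: random.Random,
                       train_frac: float = 0.8) -> Tuple[List[int], List[int]]:
    """Random split of row indices into training (80%) and test (20%) sets."""
    idx = list(range(n_rows))
    rng.shuffle(idx)
    cut = int(round(train_frac * n_rows))
    return idx[:cut], idx[cut:]


# Toy example: one of the four case/non-case pairs is ranked the wrong way round
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

Repeating the split 100 times and averaging the test-set AUCs gives a cross-validated estimate of the kind reported in the Results.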
Calibration refers to a model’s ability to correctly estimate the risk of the outcome (Steyerberg and Vergouwe, 2014). We assessed model calibration both quantitatively and graphically (‘weak’ and ‘moderate’ calibration, respectively, Van Calster et al., 2019). First, we assessed calibration-in-the-large, or the calibration intercept, which compares the mean predicted risk with the observed event rate: we fit an intercept-only logistic regression of the true status with the CPR-predicted log-odds entered as an offset. Next, we used the calibration slope to assess the spread of the estimated probabilities, whereby we fit a logistic regression model with log-odds of the true status as the dependent variable and CPR-predicted log-odds as the independent variable. Finally, we assessed moderate calibration graphically, whereby we calculated the predicted probability of growth faltering for each child in a given analysis using each iteration of each n-variable model fit. These predicted probabilities were then binned into deciles, and the proportion of each decile who truly experienced the outcome was calculated for each iteration of each n-variable model. The mean predicted probability and observed proportion were calculated for each decile across iterations. These average observed proportions were then plotted against averaged deciles for each n-variable model fit (see https://github.com/LeungLab/CPRgrowthfaltering for full analytic code).
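The two 'weak' calibration measures can be made concrete with a small Python sketch. It fits both logistic models by Newton's method using only the standard library; it is an illustration under simplifying assumptions (no confidence intervals, synthetic data) and is not the authors' R code. All function and variable names are hypothetical.

```python
import math
import random
from typing import Sequence, Tuple


def _sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def calibration_intercept(y: Sequence[int], lp: Sequence[float],
                          iters: int = 50) -> float:
    """Intercept a in logit P(y=1) = a + lp, with predicted log-odds lp as offset."""
    a = 0.0
    for _ in range(iters):
        p = [_sigmoid(a + x) for x in lp]
        grad = sum(yi - pi for yi, pi in zip(y, p))   # score for a
        info = sum(pi * (1.0 - pi) for pi in p)       # Fisher information
        step = grad / info
        a += step
        if abs(step) < 1e-10:
            break
    return a


def calibration_slope(y: Sequence[int], lp: Sequence[float],
                      iters: int = 50) -> Tuple[float, float]:
    """(a, b) in logit P(y=1) = a + b * lp; b is the calibration slope."""
    a, b = 0.0, 1.0
    for _ in range(iters):
        p = [_sigmoid(a + b * x) for x in lp]
        w = [pi * (1.0 - pi) for pi in p]
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * x for yi, pi, x in zip(y, p, lp))
        h00 = sum(w)
        h01 = sum(wi * x for wi, x in zip(w, lp))
        h11 = sum(wi * x * x for wi, x in zip(w, lp))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det              # solve the 2x2 Newton system
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if max(abs(da), abs(db)) < 1e-10:
            break
    return a, b


# Synthetic, perfectly calibrated predictions (illustrative data only)
rng = random.Random(1)
lp = [rng.gauss(-1.0, 1.0) for _ in range(20000)]
y = [1 if rng.random() < _sigmoid(x) else 0 for x in lp]
intercept = calibration_intercept(y, lp)
_, slope = calibration_slope(y, lp)

# Inflating every predicted log-odds by 0.5 mimics systematic overestimation
intercept_over = calibration_intercept(y, [x + 0.5 for x in lp])
```

On well-calibrated predictions the intercept is near 0 and the slope near 1. Inflating the predicted log-odds by 0.5 moves the intercept down by the same amount, which is why a negative intercept is read as overestimation of risk; a slope above 1 indicates predictions that are too moderate, and a slope below 1 predictions that are too extreme.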
Based on top predictors available in both GEMS and MAL-ED (see Results), the 2-variable GEMS-derived CPR of growth faltering was externally validated in MAL-ED data. A logistic regression was fit to all diarrhea cases age 0–23 months in GEMS data, with predictors chosen based on random forest. This model was then used to predict growth faltering in diarrhea cases in MAL-ED (age in MAL-ED converted from days to months), and discrimination and calibration were assessed as described above.
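Applying a 2-variable logistic CPR to a new dataset amounts to evaluating the fitted linear predictor on the new children's age and HAZ. The Python sketch below shows the mechanics only. The coefficient values are placeholders invented for illustration and are NOT the fitted GEMS estimates, which are not reported in this section. The days-to-months conversion factor (365.25/12) is also an assumption, since the text does not state which conversion was used.

```python
import math
from typing import Mapping

DAYS_PER_MONTH = 365.25 / 12  # assumption: average month length (30.4375 days)

# Placeholder coefficients for illustration only; NOT the fitted GEMS values.
PLACEHOLDER_COEF = {"intercept": -1.0, "age_months": -0.05, "haz": 0.30}


def days_to_months(age_days: float) -> float:
    """Convert MAL-ED age in days to months so it matches the GEMS age scale."""
    return age_days / DAYS_PER_MONTH


def predicted_risk(age_days: float, haz: float,
                   coef: Mapping[str, float] = PLACEHOLDER_COEF) -> float:
    """Predicted probability of growth faltering from a 2-variable logistic model."""
    linear_predictor = (coef["intercept"]
                        + coef["age_months"] * days_to_months(age_days)
                        + coef["haz"] * haz)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# A hypothetical 9-month-old (274 days) with HAZ of -1.2 at the diarrhea episode
risk = predicted_risk(274, -1.2)
```

The predicted probabilities (or their log-odds) from the derivation model are then compared with the observed outcomes in the validation data to obtain the external AUC, calibration intercept, and calibration slope.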
Sensitivity and subgroup analyses
We undertook additional sensitivity and subgroup analyses to explore if our ability to predict growth faltering improved in specific patient populations or with additional predictors within GEMS data. First, we explored age-strata specific CPRs for children 0–11, 12–23, 0–23, and 24–59 months. Second, we explored the predictive ability of MUAC instead of and in addition to HAZ. Third, we attempted to account for potential seasonal variation by adding a predictor for month of diarrhea. Fourth, we added indicator variables for the use of antibiotics before presentation (enrollment), while at clinic, prescription to take home after care, and ever. Fifth, we limited our outcome to only very severe growth faltering, defined as a decrease ≥1.0 HAZ (as opposed to ≥0.5 HAZ in the main analysis). Sixth, we explored the impact diarrhea etiology had on growth faltering prediction. We added variables for the presence/absence of Shigella, Cryptosporidium, Shigella + Cryptosporidium infections, and any viral etiology (defined as infection with any of the following: astrovirus, norovirus GII, rotavirus, sapovirus, and adenovirus 40/41). Etiology-specific infections were defined as an episode-specific attributable fraction (AFe) greater than or equal to a given cutoff (0.3, 0.5, and 0.7 were considered) (Kotloff et al., 2013). Seventh, we explored the prevalence of growth faltering in healthy controls, and identified top predictors and their ability to predict growth faltering in controls. Potential predictors related to diarrhea were not considered amongst controls (e.g., number of days with diarrhea at presentation). Eighth, we explored the role of stunting at presentation on growth faltering by limiting the CPR to only those who were not already stunted at presentation (HAZ ≥−2). Ninth, we fit a CPR predicting any stunting at follow-up, both among all presenting patients as well as limited to those NOT stunted at presentation.
Finally, we conducted a quasi-external validation within the GEMS data by fitting a model to one continent and validating it on the other. All analysis was conducted in R 4.0.2 using the packages ‘ranger’, ‘cvAUC’, and ‘pROC’.
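The etiology indicators in the sixth sensitivity analysis can be illustrated with a short Python sketch. The function name and dictionary layout are hypothetical. Two choices are assumptions rather than details from the text: a pathogen with no recorded AFe is treated as not attributable, and "Shigella + Cryptosporidium" is read as both pathogens meeting the cutoff in the same episode.

```python
from typing import Dict, Mapping

# Viral pathogens listed in the definition of "any viral etiology"
VIRAL = ("astrovirus", "norovirus GII", "rotavirus", "sapovirus", "adenovirus 40/41")


def etiology_flags(afe: Mapping[str, float], cutoff: float = 0.5) -> Dict[str, bool]:
    """Binary etiology indicators for one episode from its AFe values."""
    def attributable(pathogen: str) -> bool:
        # Assumption: a missing AFe is treated as 0 (not attributable)
        return afe.get(pathogen, 0.0) >= cutoff

    shigella = attributable("Shigella")
    crypto = attributable("Cryptosporidium")
    return {
        "shigella": shigella,
        "cryptosporidium": crypto,
        # Assumption: the combined indicator means both pathogens meet the cutoff
        "shigella_and_cryptosporidium": shigella and crypto,
        "any_viral": any(attributable(v) for v in VIRAL),
    }


# The same episode evaluated at the three cutoffs considered in the analysis
episode = {"Shigella": 0.6, "rotavirus": 0.4}
flags_by_cutoff = {c: etiology_flags(episode, c) for c in (0.3, 0.5, 0.7)}
```

In this example the episode counts as viral only at the 0.3 cutoff and as Shigella-attributable at 0.3 and 0.5 but not 0.7, which shows how the choice of cutoff changes the indicator values fed to the model.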
Results
Growth faltering in children following acute diarrhea in GEMS and MAL-ED
There were 9439 children with acute diarrhea enrolled in GEMS. In the analysis of the primary outcome (growth faltering), 110 observations were dropped for having follow-up measurements taken <49 or >91 days after enrollment, and 1276 were dropped for having implausible HAZ measurements, leaving an analytic sample of 8053. An additional 414 observations were dropped for having missing predictor data, as random forest analysis requires complete cases. Of the remaining 7639 children, 1744 (22.8%) experienced severe growth faltering (≥0.5 decrease in HAZ), and 357 (4.7%) experienced very severe growth faltering (≥1.0 decrease in HAZ) (Figure 1). Growth faltering rates differed by country, with Mozambique and The Gambia having the highest rates of growth faltering (34.5% and 31.9% experienced severe growth faltering, respectively) and Mali having the lowest rate (14.9%, Supplementary file 1). Growth faltering rates also varied by child’s age, with a higher proportion of younger children experiencing growth faltering than older children (Supplementary file 1).
In the analysis of MAL-ED data, we started with 6617 diarrhea episodes from 1390 children. In order to align with GEMS inclusion criteria and limit to acute onset diarrhea, 566 diarrhea episodes were dropped for having prolonged or persistent diarrhea (>7 days duration). An additional 125 episodes were dropped for having missing HAZ measurements or an HAZ follow-up measurement <49 or >95 days from diarrhea onset, and 138 episodes were dropped for having implausible HAZ measurements, leaving 5788 diarrhea episodes from 1350 children. An additional 105 observations were dropped for having missing predictor data. Of the remaining 5683 observations from 1322 children, 961 (16.9%) episodes experienced severe growth faltering (≥0.5 decrease in HAZ) and 161 (2.8%) episodes experienced very severe growth faltering (≥1.0 decrease in HAZ, Figure 1).
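The sample-flow counts and percentages in the two paragraphs above can be checked with simple arithmetic. The Python snippet below is only a worked check of the reported numbers; the helper names are hypothetical.

```python
def remaining(start: int, *dropped: int) -> int:
    """Sample size left after removing each group of dropped observations."""
    return start - sum(dropped)


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


# GEMS: follow-up timing, implausible HAZ, then missing predictors
gems_analytic = remaining(9439, 110, 1276)       # 8053
gems_complete = remaining(gems_analytic, 414)    # 7639

# MAL-ED: prolonged diarrhea, missing/out-of-window HAZ, implausible HAZ,
# then missing predictors
maled_analytic = remaining(6617, 566, 125, 138)  # 5788
maled_complete = remaining(maled_analytic, 105)  # 5683

gems_severe = pct(1744, gems_complete)           # 22.8
gems_very_severe = pct(357, gems_complete)       # 4.7
maled_severe = pct(961, maled_complete)          # 16.9
maled_very_severe = pct(161, maled_complete)     # 2.8
```

All intermediate and final counts agree with those stated in the text.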
Derivation of a CPR to identify children who went on to severe growth faltering following acute diarrhea using GEMS data
After random forest screening of variables, logistic regression models consistently had higher AUCs than random forest regression models (Figure 2); we therefore present only the more easily interpreted logistic regression results from here on. In Table 1, we show the top 10 most predictive variables, ranked from most to least important, for severe growth faltering (≥0.5 decrease in HAZ). The top predictive variables for severe growth faltering were: age, HAZ at enrollment, respiratory rate, temperature, number of people living in the household, number of rooms used for sleeping, number of days of diarrhea at presentation, number of other households that share same fecal waste disposal facility (e.g., latrine), whether the child was currently breastfed at time of diarrhea, and the number of children <60 months old living in the household. The maximum AUC attained with the model was 0.75 (95% confidence interval [CI]: 0.75, 0.75) with a model of 20 variables, while an AUC of 0.71 (95% CI: 0.71, 0.72), 0.72 (95% CI: 0.72, 0.72), and 0.72 (95% CI: 0.72, 0.72) could be obtained with a CPR of 2, 5, and 10 variables, respectively (Figure 2). When limited to children 0–23 months of age, AUC decreased to 0.64 (95% CI: 0.64, 0.64) for 10 variables. In the full 10-variable model, we achieved a specificity of 0.47 at a sensitivity of 0.80 (Figure 2—figure supplement 1). The average predicted probability of growth faltering was consistently close to the average observed probability (calibration-in-the-large, or intercept), and the spread of predicted probabilities was similar to the spread of observed probabilities (calibration slope) for models including 1–10 predictor variables (Table 2, Figure 3, Figure 3—figure supplement 1).
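Reporting specificity at a fixed sensitivity means choosing the risk threshold that flags the target share of children who go on to falter, then asking what share of the remaining children fall below it. The Python sketch below illustrates that calculation on toy data; the function name and the data are hypothetical, and it is not how the reported 0.47 was computed.

```python
from typing import Sequence, Tuple


def specificity_at_sensitivity(labels: Sequence[int], scores: Sequence[float],
                               target_sens: float = 0.80
                               ) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) for the highest threshold
    whose sensitivity reaches the target; a child is flagged if score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one non-case")
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= thr)
        sens = tp / n_pos
        if sens >= target_sens:
            tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < thr)
            return thr, sens, tn / n_neg
    raise ValueError("target sensitivity not reachable")


# Toy data: five children who faltered (1) and five who did not (0)
toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.65, 0.5, 0.3, 0.1, 0.05]
threshold, sens, spec = specificity_at_sensitivity(toy_labels, toy_scores)
```

In the toy data a threshold of 0.6 flags four of the five children who faltered (sensitivity 0.80) and leaves four of the five who did not below the threshold (specificity 0.80).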
Table 1. Growth faltering.
Variable importance ordering and cross-validated average overall area under the curve (AUC) and AUC by patient subset, with 95% confidence intervals, for a 5-variable (first AUC row) and 10-variable (second AUC row) logistic regression model for predicting growth faltering in children in 7 low- and middle-income countries (LMICs), derived from Global Enteric Multicenter Study (GEMS) data (≥0.5 decrease in height-for-age z-score [HAZ] in children with acute diarrhea).
Dataset | GEMS | GEMS | GEMS | GEMS | GEMS | MAL-ED |
---|---|---|---|---|---|---|
Patient subset | 0–59 months (main text model) | 0–59 months, limited to those NOT stunted at enrollment (HAZ ≥−2; 5659/7639 [74.1%]) | 0–59 months, limited to those NOT stunted at enrollment; outcome is ANY stunting at follow-up (HAZ <−2) | 0–23 months (for external validation) | Healthy controls | 0–23 months |
AUC, 5-variable model (95% CI) | 0.72 (0.72, 0.72) | 0.71 (0.70, 0.72) | 0.90 (0.89, 0.91) | 0.64 (0.63, 0.65) | 0.79 (0.78, 0.79) | 0.67 (0.67, 0.68) |
AUC, 10-variable model (95% CI) | 0.72 (0.72, 0.72) | 0.71 (0.70, 0.72) | 0.90 (0.89, 0.90) | 0.64 (0.64, 0.64) | 0.79 (0.79, 0.79) | 0.68 (0.67, 0.69) |
1 | Age (months) | Age (months) | HAZ | HAZ | Age (months) | HAZ |
2 | HAZ | HAZ | Age | Age (months) | HAZ | Age (days) |
3 | Respiratory rate | Respiratory rate | Respiratory rate | Temperature | Respiratory rate | Total days breastfeeding |
4 | Temperature | Temperature | Temperature | Respiratory rate | Temperature | Total days in all diarrheal episodes to date |
5 | Num. people living in household | Num. people living in household | Num. people living in household | Num. people living in household | Num. people living in household | Mean number of people per room |
6 | Num. rooms used for sleeping | Num. rooms used for sleeping | Num. days of diarrhea at presentation | Num. rooms used for sleeping | Breastfed | Days with diarrhea so far in this episode |
7 | Num. days of diarrhea at presentation | Num. days of diarrhea at presentation | Num. other households that share same fecal waste facility | Num. days of diarrhea at presentation | Num. rooms used for sleeping | Maternal education (years) |
8 | Num. other households that share same fecal waste facility | Breastfed | Num. rooms used for sleeping | Num. other households that share same fecal waste facility | Num. children <60 months live in household | Days since last diarrhea episode |
9 | Breastfed | Num. other households that share same fecal waste facility | Num. children <60 months live in household | Num. children <60 months live in household | Caregiver education | People sleeping in house |
10 | Num. children <60 months live in household | Num. children <60 months live in household | Caregiver education | Caregiver education | Num. other households share latrine | Max loose stools in this episode |
Table 2. Calibration intercept and slope.
Number of predictor variables | GEMS 0–59 months: intercept (95% CI) | GEMS 0–59 months: slope (95% CI) | GEMS 0–23 months (for external validation): intercept (95% CI) | GEMS 0–23 months (for external validation): slope (95% CI) | MAL-ED 0–23 months rederivation: intercept (95% CI) | MAL-ED 0–23 months rederivation: slope (95% CI) | GEMS-derived model applied to MAL-ED data: intercept (95% CI) | GEMS-derived model applied to MAL-ED data: slope (95% CI) |
---|---|---|---|---|---|---|---|---|
1 | 2.9 × 10⁻³ (−1.2 × 10⁻¹, 1.3 × 10⁻¹) | 1.0 (0.82, 1.2) | −1.0 × 10⁻² (−0.14, 0.12) | 0.97 (0.62, 1.3) | 9.6 × 10⁻³ (−0.32, 0.32) | 1.0 (0.35, 1.7) | | |
2 | 3.6 × 10⁻³ (−1.2 × 10⁻¹, 1.3 × 10⁻¹) | 1.0 (0.84, 1.2) | −1.1 × 10⁻² (−0.14, 0.12) | 0.98 (0.70, 1.3) | 1.1 × 10⁻² (−0.33, 0.33) | 1.0 (0.51, 1.5) | −0.32 (−0.54, −0.11) | 1.5 (1.0, 2.1) |
3 | 3.6 × 10⁻³ (−1.2 × 10⁻¹, 1.3 × 10⁻¹) | 1.0 (0.84, 1.2) | −1.2 × 10⁻² (−0.14, 0.12) | 0.97 (0.70, 1.2) | 1.1 × 10⁻² (−0.33, 0.33) | 0.99 (0.51, 1.5) | | |
4 | 4.1 × 10⁻³ (−1.2 × 10⁻¹, 1.3 × 10⁻¹) | 1.0 (0.84, 1.2) | −1.2 × 10⁻² (−0.14, 0.12) | 0.97 (0.71, 1.2) | 1.1 × 10⁻² (−0.33, 0.33) | 0.97 (0.49, 1.5) | | |
5 | 4.2 × 10⁻³ (−1.2 × 10⁻¹, 1.3 × 10⁻¹) | 1.0 (0.83, 1.2) | −1.2 × 10⁻² (−0.14, 0.12) | 0.96 (0.70, 1.2) | 1.1 × 10⁻² (−0.33, 0.33) | 0.95 (0.48, 1.5) | | |
6 | 4.2 × 10⁻³ (−1.2 × 10⁻¹, 1.3 × 10⁻¹) | 1.0 (0.83, 1.2) | −1.2 × 10⁻² (−0.14, 0.12) | 0.96 (0.70, 1.2) | 1.2 × 10⁻² (−0.33, 0.33) | 0.94 (0.47, 1.5) | | |
7 | 4.3 × 10⁻³ (−1.2 × 10⁻¹, 1.3 × 10⁻¹) | 1.0 (0.83, 1.2) | −1.2 × 10⁻² (−0.14, 0.12) | 0.96 (0.70, 1.2) | 1.2 × 10⁻² (−0.33, 0.33) | 0.92 (0.47, 1.4) | | |
8 | 4.4 × 10⁻³ (−1.2 × 10⁻¹, 1.3 × 10⁻¹) | 1.0 (0.83, 1.2) | −1.2 × 10⁻² (−0.14, 0.12) | 0.95 (0.69, 1.2) | 1.2 × 10⁻² (−0.33, 0.33) | 0.92 (0.47, 1.4) | | |
9 | 4.7 × 10⁻³ (−1.2 × 10⁻¹, 1.3 × 10⁻¹) | 1.0 (0.83, 1.2) | −1.2 × 10⁻² (−0.14, 0.12) | 0.95 (0.69, 1.2) | 1.2 × 10⁻² (−0.33, 0.33) | 0.91 (0.47, 1.4) | | |
10 | 4.8 × 10⁻³ (−1.2 × 10⁻¹, 1.3 × 10⁻¹) | 1.0 (0.83, 1.2) | −1.2 × 10⁻² (−0.14, 0.12) | 0.93 (0.69, 1.2) | 1.2 × 10⁻² (−0.33, 0.33) | 0.89 (0.46, 1.4) | | |

Figure 2. Area under the curves (AUCs).
Cross-validated AUC achieved by number of predictive variables included in random forest regression and logistic regression models predicting growth faltering (≥0.5 decrease in height-for-age z-score [HAZ]) in children 0–59 months of age presenting with diarrhea in the Global Enteric Multicenter Study (GEMS).

Figure 3. 2-variable clinical prediction rule (CPR) for growth faltering.
Calibration curve and discriminative ability of the 2-variable (age, height-for-age z-score [HAZ] at enrollment) model predicting growth faltering (≥0.5 decrease in HAZ) in children presenting for acute diarrhea in LMICs.
Rederivation and external validation of a CPR to identify children who went on to severe growth faltering following acute diarrhea using MAL-ED data
We then derived a CPR for growth faltering using MAL-ED data, and found that the top predictors were similar to those identified using GEMS data, with age and HAZ at the time of diarrhea being the top 2 predictors. Other top predictors in MAL-ED included total days of breastfeeding, total days in all diarrhea episodes to date, mean number of people per room of home, days with diarrhea so far in this episode, number of years of maternal education, days since last diarrhea episode, number of people sleeping in house, and maximum number of loose stools in this diarrhea episode (Table 1). The discriminative performance of the full model was similar to that found with GEMS (0.72 [95% CI: 0.72, 0.72] in GEMS, 0.68 [95% CI: 0.67, 0.69] in MAL-ED). The average predicted probability of growth faltering was consistently close to the average observed probability (calibration-in-the-large, or intercept). The calibration slope point estimates fell slightly below 1.0 as predictors were added, suggesting that predicted probabilities were slightly more extreme than observed probabilities, but there was no evidence that the slopes differed from 1.0 for models including 1–10 predictor variables (all 95% CIs include 1.0; see Table 2, Figure 3—figure supplement 2).
Due to a lack of overlap in variables between datasets, we were unable to externally validate the 10-variable version of our growth faltering CPR. However, the top 2 predictors were available in both the GEMS and MAL-ED datasets. Therefore, we took the 2-variable CPR of growth faltering derived from children 0–23 months of age in GEMS, including HAZ at enrollment and age (the top 2 predictors), and externally validated it in MAL-ED data. The CPR had marginal discrimination in the GEMS data (AUC = 0.64, 95% CI: 0.64, 0.64), and a slight increase in discriminative ability at external validation in MAL-ED data (AUC = 0.68, 95% CI: 0.63, 0.74). On average, the CPR overestimated probability of growth faltering (calibration intercept −0.32, 95% CI: −0.54, −0.11), and predictions tended to be too moderate (calibration slope 1.5, 95% CI: 1.0, 2.1) (Table 2, Figure 3, Figure 3—figure supplement 3). Odds ratios for the 10-variable model predicting severe growth faltering in MAL-ED are shown in Supplementary file 1.
Addition of MUAC, diarrhea etiology, and antibiotic use did not meaningfully impact discriminative performance of CPR to identify children who went on to severe growth faltering following acute diarrhea in GEMS
Table 1 and Supplementary file 1 present the results of the growth faltering sensitivity analyses. Top predictor variables were highly consistent across models and included patient demographics, patient symptoms, and indicators of household wealth. CPRs of higher age strata had higher AUCs (0.76 [95% CI: 0.75, 0.77] in 24–59 months in GEMS vs. 0.60 [95% CI: 0.59, 0.60] in 0–11 months in GEMS).
When MUAC was considered as a potential predictor (instead of HAZ), MUAC replaced HAZ as a top predictor, all other top 10 predictors remained the same, and AUC decreased (down to 0.70, 95% CI: 0.70, 0.70). When both HAZ and MUAC were considered as potential predictors, both were top predictors, but AUC remained unchanged compared to the main model that considered only HAZ (0.72, 95% CI: 0.72, 0.73). The predictors of very severe growth faltering (≥1.0 decrease in HAZ) were similar to the predictors of severe growth faltering (≥0.5 decrease in HAZ), though predictive ability was better (AUC 0.80 [95% CI: 0.79, 0.80] for ≥1.0 vs. 0.72 [95% CI: 0.71, 0.73] for ≥0.5).
Accounting for seasonality did not meaningfully improve the CPR, and antibiotic use and diarrhea etiology were consistently not ranked as top predictors of growth faltering (Supplementary file 1). The addition of more predictor variables only marginally improved AUCs. When data were limited to only those children not already stunted at initial presentation, top predictors and AUC of growth faltering were similar (0.71, 95% CI: 0.70, 0.72). While the top predictor variables were similar, the AUC of the CPR predicting any stunting at follow-up was noticeably higher, AUC = 0.90 (95% CI: 0.90, 0.91).
Derivation of a CPR to identify children without diarrhea (controls) who went on to severe growth faltering using GEMS data
Top predictors of growth faltering were similar in non-diarrhea controls compared to cases in GEMS (Table 1), but predictive ability was higher (AUC 0.79 [95% CI: 0.78, 0.79] in controls vs. 0.72 [95% CI: 0.72, 0.72] in cases). Again, top predictors were consistent with previous models and included age, HAZ at enrollment, respiratory rate, temperature, number of people living in household, breastfed, number of rooms used for sleeping, number of children under 60 months old who live in household, education level of primary caregiver, and number of other households that share same fecal waste disposal facility (e.g., latrine). The maximum AUC attained with the model was 0.79 (95% CI: 0.79, 0.80) with a model of 15 variables, while an AUC of 0.79 (95% CI: 0.78, 0.79) and 0.79 (95% CI: 0.79, 0.79) could be obtained with a CPR of 5 and 10 variables, respectively (Figure 3—figure supplement 4).
Discussion
By utilizing data from two large multicenter clinical studies of pediatric diarrhea, we used a combination of machine learning and conventional regression methods to derive and validate CPRs for linear growth faltering. The discriminative performance of our CPR for growth faltering was remarkably similar between the two datasets (AUC = 0.72, 95% CI: 0.72, 0.72, based on GEMS 0–59 months; 0.68, 95% CI: 0.67, 0.69 based on MAL-ED 0–24 months). We were then able to externally validate a 2-variable version, which also had similar discriminative ability between the datasets (AUC 0.64–0.68 for 0–23 and 0–24 months in GEMS and MAL-ED, respectively). Our findings suggest the potential for a parsimonious prediction rule-guided algorithm to identify young children with acute diarrhea for appropriate triage and follow-up.
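As a rough illustration of how a 2-variable CPR of this kind could be applied at the point of care, the sketch below converts age and enrollment HAZ into a predicted probability with a logistic model. All coefficients and the decision threshold are hypothetical placeholders chosen for illustration; they are not the fitted values from GEMS or MAL-ED, and their signs and magnitudes should not be read as results.

```python
import math

# HYPOTHETICAL coefficients for illustration only (not the fitted values).
INTERCEPT = -1.0
BETA_AGE_MONTHS = -0.02
BETA_HAZ = 0.4


def predicted_risk(age_months, haz,
                   intercept=INTERCEPT,
                   beta_age=BETA_AGE_MONTHS,
                   beta_haz=BETA_HAZ):
    """Predicted probability of growth faltering from a 2-variable logistic model."""
    linear_predictor = intercept + beta_age * age_months + beta_haz * haz
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def flag_for_follow_up(age_months, haz, threshold=0.25):
    """Flag a child when predicted risk meets a (hypothetical) threshold."""
    return predicted_risk(age_months, haz) >= threshold


# Example: a 12-month-old with an enrollment HAZ of -0.5
print(round(predicted_risk(12, -0.5), 3), flag_for_follow_up(12, -0.5))
```

In practice the threshold would be chosen from the sensitivity and specificity trade-off appropriate to the setting, and the coefficients would come from the fitted model.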
The limited number of studies that aim to identify children most likely to growth falter after acute diarrhea have resulted in CPRs with varying discriminative ability and generalizability. Our full CPRs were better at identifying growth faltering than that of Brander et al., 2019 (AUC = 0.67, 95% CI: 0.64, 0.69), which was not externally validated, and worse than that of Hanieh et al., 2019 (AUC = 0.85, 95% CI: 0.80, 0.90), which only used data from a single country. The top predictors of growth faltering identified by random forests in our analysis were consistent with existing knowledge of the drivers of growth faltering – child demographics, child symptoms, and indicators of household wealth. The top 2 variables (used in our parsimonious externally validated CPR) were age and baseline HAZ. However, despite the inclusion of markers of disease severity (temperature, respiratory rate, number of days of diarrhea), overall ability to predict growth faltering was moderate, and consideration of additional factors related to nature of disease (etiology, antibiotics) did not improve discriminative ability. This is consistent with previous analysis in GEMS data that found treating diarrhea with antibiotics generally did not prevent growth faltering (except for Shigella infections, Nasrin et al., 2021).
Furthermore, the similar incidence of growth faltering in diarrhea cases and matched controls (particularly in the youngest children), as well as the almost identical predictive variables and similar AUCs, suggests that the impact of a single episode of acute diarrhea on growth trajectory may be relatively low. It is possible that the entire diarrheal history of a child (e.g., frequency and severity of acute diarrhea), or subclinical enteric infections that do not result in diarrhea, are more important to their growth trajectory than a single diarrheal episode, though evidence is mixed (Checkley et al., 2008; Rogawski et al., 2018; Deichsel et al., 2020). Indeed, while the design of GEMS does not allow for the exploration of this hypothesis, MAL-ED does. Total days in all diarrheal episodes, days with diarrhea so far this episode, and days since last diarrhea episode were all top 10 predictors of growth faltering in MAL-ED. Furthermore, in GEMS, the average baseline HAZ at enrollment was 0.5 HAZ lower in children who did not experience growth faltering than in children who did (Figure 3—figure supplement 5), suggesting the possibility that children need to have high enough HAZ in order to have the potential to falter. In contrast, children enrolled in Mali had the highest median HAZ at enrollment, and also had the lowest proportion of children who experienced growth faltering (Supplementary file 1). It is also possible that the underlying cause(s) of stunting are complex and interrelated, and relatively simple predictive models are not able to accurately distinguish which children do and do not experience sufficient causes. In sensitivity analyses, we demonstrated our ability to predict any stunting at follow-up with high accuracy (Table 1, Supplementary file 1). However, this represents a related but distinct outcome from our original aim, namely a slowing down of growth as opposed to stunting, and may warrant different clinical intervention.
While effective interventions exist for treating acute malnutrition (e.g., exclusive breastfeeding for the first 6 months of life, inpatient- and community-based management of acute malnutrition using corn-soy blend or ready-to-use therapeutic food; WHO, 2013; Bergeron and Castleman, 2012; Keats et al., 2021), there is little evidence-based guidance on how to reverse the effects of chronic malnutrition once a child is stunted (Bergeron and Castleman, 2012; Leroy et al., 2015; Reinhardt and Fanzo, 2014; Pavlinac et al., 2018; WHO, 2015). We found that approximately one in five children experience severe growth faltering subsequent to acute diarrhea, that is, an additional ≥0.5 decrease in HAZ in the 2–3 months after acute diarrhea. Currently, presenting to care for an acute illness, such as diarrhea, offers an opportunity for medical personnel to assess and treat children for acute malnutrition through intensive feeding programs. Our CPR provides a tool for identifying patients likely to experience additional growth faltering after acute diarrhea. Current malnutrition recommendations are based on patient presentation – whether a child is underweight when they present to the clinic. Our CPR could be used to identify children not currently stunted and therefore not currently recommended for nutritional interventions, but who are likely to slow down in growth and therefore at higher risk of incident stunting. Identifying these children would allow clinicians to connect patients with community-based nutrition interventions (e.g., maternal support for safe introduction of weaning foods, small quantity lipid nutrient supplements, etc.; Bhutta et al., 2013; Bhutta et al., 2008; Cole, 2020; Zhang et al., 2021) to prevent additional effects of chronic malnutrition, namely irreversible stunting.
Given our ability to predict growth faltering in healthy controls in GEMS, community screening for those at risk of growth faltering (not just those presenting with acute diarrhea) may also be prudent. This would represent a different potential intervention strategy and future research should explore this possibility further.
Our study has a number of strengths and limitations. We derived CPRs for growth faltering from two multisite, prospective studies that included longitudinal follow-up with extensive etiologic testing. Unlike previous work in this area, we used random forests for variable selection which do not require assumptions about relationships between the underlying variables and generally outperform (Singal et al., 2013) conventional model building techniques. While our complete-case analysis strategy could introduce bias due to missing data, we were able to re-derive the 10-variable version in two distinct datasets with similar results. While we were only able to externally validate a 2-variable version of our growth faltering CPR, its discriminative performance was similar to the full 10-variable version, and was robust to external validation. Furthermore, while the observation windows were large for many variables in the MAL-ED dataset used for external validation (up to 90 days for dietary variables, and up to 6 months for household descriptors), the variables of interest in the 2-variable CPR were observed no more than 31 days from the start of diarrhea. In addition, we considered all diarrhea as an outcome of interest in MAL-ED, whereas the analysis in GEMS was limited to MSD. When limiting the MAL-ED analysis to MSD as defined in GEMS, the top predictors and discriminative ability were very similar. The quasi-external validation between continents within GEMS data, as well as the country-specific models within GEMS, all had similar top predictors and discriminative performance, further supporting the overall validity of our CPR. Finally, we explored a range of AFe cutoffs for etiology, with consistent results.
Our study can also serve as a guide for future CPR development. We used a prediction-based approach for variable selection, and compared multiple model fitting strategies. We assessed model calibration as well as discrimination, and reported the results from numerous sensitivity analyses. Finally, we designed our study a priori to incorporate external validation, lending additional confidence to the generalizability of our results.
In conclusion, we used data from two large multi-country studies to derive and validate a CPR for growth faltering in children presenting for diarrhea treatment. Our findings indicate that use of prediction rules, potentially applied as clinical decision support tools, could help to identify additional children at risk of poor outcomes after an episode of diarrheal illness, that is, children who are not currently stunted but whose growth is likely to decelerate. In settings with high mortality and morbidity in early childhood, such tools could represent a cost-effective way to target resources toward those who need them most.
Data availability
This is a secondary analysis of the GEMS and MAL-ED datasets. These data are available to the public through the following website: https://clinepidb.org/ce/app/. Data requests are submitted through the website listed, and requests are reviewed and approved by the investigators of those original studies consistent with their protocols and data sharing policies. As of the time of submission of this manuscript, the GEMS Data Access Request asked for purpose, hypothesis/research question, analysis plan, dissemination plan, and if anyone from the GEMS study team had already been approached regarding this request. The MAL-ED data were available for download without submitting a Data Access Request. Data cleaning and statistical code needed to reproduce all parts of this analysis are available from the corresponding author's GitHub page: https://github.com/LeungLab/CPRgrowthfaltering (copy archived at swh:1:rev:f3fd53b5713ef787d3ae2cd4a81f3286f52f2746; Ahmed, 2022).
The following previously published datasets were used:
Dataset 1: Gates Enterics Project, Levine MM, Kotloff K, Nataro J, Khan AZA, Saha D, Adegbola FR, Sow S, Alonso P, Breiman R, Sur D, Faruque A. 2018. Study GEMS1 Case Control. https://clinepidb.org/ce/app/record/dataset/DS_841a9f5259#Contacts. Database and Identifier: ClinEpiDB, DS_841a9f5259
Dataset 2: The Etiology, Risk Factors, and Interactions of Enteric Infections and Malnutrition and the Consequences for Child Health (MAL-ED). Primary Contact: David Spiro, Fogarty International Center, National Institutes of Health, Bethesda, MD, USA. https://clinepidb.org/ce/app/workspace/analyses/DS_5c41b87221/new/details.
References
- Software: CPRgrowthfaltering, version swh:1:rev:f3fd53b5713ef787d3ae2cd4a81f3286f52f2746. Software Heritage.
- Program responses to acute and chronic malnutrition: divergences and convergences. Advances in Nutrition 3:242–249. https://doi.org/10.3945/an.111.001263
- Micronutrient malnutrition, infection, and immunity: an overview. Nutrition Reviews 60:S40–S55. https://doi.org/10.1301/00296640260130722
- Immune dysfunction as a cause and consequence of malnutrition. Trends in Immunology 37:386–398. https://doi.org/10.1016/j.it.2016.04.003
- Multi-country analysis of the effects of diarrhoea on childhood stunting. International Journal of Epidemiology 37:816–830. https://doi.org/10.1093/ije/dyn099
- Optimizing interventions to prevent chronic malnutrition: the search for the Holy Grail. The Journal of Pediatrics 222:17–18. https://doi.org/10.1016/j.jpeds.2020.03.008
- The World Health Organization’s global target for reducing childhood stunting by 2025: rationale and proposed actions. Maternal & Child Nutrition 9 Suppl 2:6–26. https://doi.org/10.1111/mcn.12075
- Childhood stunting: a global perspective. Maternal & Child Nutrition 12 Suppl 1:12–26. https://doi.org/10.1111/mcn.12231
- An Introduction to Statistical Learning with Applications in R. New York, NY: Springer. https://doi.org/10.1007/978-1-4614-7138-7
- Effective interventions to address maternal and child malnutrition: an update of the evidence. The Lancet. Child & Adolescent Health 5:367–384. https://doi.org/10.1016/S2352-4642(20)30274-1
- The MAL-ED study: a multinational and multidisciplinary approach to understand the relationship between enteric pathogens, malnutrition, gut physiology, physical growth, cognitive development, and immune responses in infants and children up to 2 years of age in resource-poor environments. Clinical Infectious Diseases 59:S193–S206. https://doi.org/10.1093/cid/ciu653
- The effect of multiple anthropometric deficits on child mortality: meta-analysis of individual data in 10 prospective studies from developing countries. The American Journal of Clinical Nutrition 97:896–901. https://doi.org/10.3945/ajcn.112.047639
- Pathogens associated with linear growth faltering in children with diarrhea and impact of antibiotic treatment: the global enteric multicenter study. The Journal of Infectious Diseases 224:S848–S855. https://doi.org/10.1093/infdis/jiab434
- Methods of analysis of enteropathogen infection in the MAL-ED cohort study. Clinical Infectious Diseases 59 Suppl 4:S233–S238. https://doi.org/10.1093/cid/ciu408
- Disease surveillance methods used in the 8-site MAL-ED cohort study. Clinical Infectious Diseases 59 Suppl 4:S220–S224. https://doi.org/10.1093/cid/ciu435
- Enteric dysfunction and other factors associated with attained size at 5 years: MAL-ED birth cohort study findings. The American Journal of Clinical Nutrition 110:131–138. https://doi.org/10.1093/ajcn/nqz004
- Determinants and impact of Giardia infection in the first 2 years of life in the MAL-ED birth cohort. Journal of the Pediatric Infectious Diseases Society 6:153–160. https://doi.org/10.1093/jpids/piw082
- Use of quantitative molecular diagnostic methods to investigate the effect of enteropathogen infections on linear growth in children in low-resource settings: longitudinal analysis of results from the MAL-ED cohort study. The Lancet. Global Health 6:e1319–e1328. https://doi.org/10.1016/S2214-109X(18)30351-6
- Machine learning algorithms outperform conventional regression models in predicting development of hepatocellular carcinoma. The American Journal of Gastroenterology 108:1723–1730. https://doi.org/10.1038/ajg.2013.332
- Maternal and environmental risk for faltered growth in the first 5 years for Tanjungsari children in West Java, Indonesia. Asia Pacific Journal of Clinical Nutrition 28:S32–S42. https://doi.org/10.6133/apjcn.201901_28(S1).0003
- Towards better clinical prediction models: seven steps for development and an ABCD for validation. European Heart Journal 35:1925–1931. https://doi.org/10.1093/eurheartj/ehu207
- Report: Levels and trends in child malnutrition: key findings of the 2021 edition of the joint child malnutrition estimates. Geneva: World Health Organization.
- Report: Levels and Trends in Child Malnutrition: Key Findings of the 2021 Edition of the Joint Child Malnutrition Estimates. Geneva: World Health Organization.
- Book: Guideline: Updates on the Management of Severe Acute Malnutrition in Infants and Children. Geneva: World Health Organization.
Decision letter
- Eduardo Franco, Senior and Reviewing Editor; McGill University, Canada
- Andrew N Mertens, Reviewer; University of California, Berkeley, United States
Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.
Decision letter after peer review:
Thank you for submitting your article "Derivation and external validation of clinical prediction rules identifying children at risk of linear growth faltering (stunting) presenting for diarrheal care" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by me in my joint role as Reviewing Editor and Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Andrew N Mertens (Reviewer #2).
As is customary in eLife, the reviewers have discussed their critiques with one another and with the Editors, and I prepared this decision letter to help you prepare a revised submission. Given the extent of the suggestions, I prefer to provide you with a compilation of the relevant suggestions in the two critiques to assist you in preparing the revisions for an eventual resubmission.
Essential revisions:
Although we expect that you will address these comments in your response letter, we also need to see the corresponding revision clearly marked in the text of the manuscript. Some of the reviewers' comments may seem to be simple queries or challenges that do not prompt revisions to the text. Please keep in mind, however, that readers may have the same perspective as the reviewers. Therefore, it is essential that you amend or expand the text to clarify the narrative accordingly.
The outcome used for prediction is a binary indicator for a decrease in height-for-age Z-score >= 0.5. A child who fails to gain height by future measurements is of concern, but this outcome also misses children who are already experiencing growth failure, and is vulnerable to a regression to the mean effect. The two most important predictors were age and current size, with current size having a positive association with risk of growth faltering. As mentioned in the discussion, there is "the possibility that children need to have high enough HAZ in order to have the potential to falter." Additionally, there may be children with erroneously high height measurements at the first measurement, so that the HAZ change >= 0.5 associated with high baseline HAZ is from measurement-error regression to the mean. I recommend also predicting absolute HAZ (or stunting status) as a secondary outcome and comparing if the important predictors change. Were alternative specifications (e.g., quantitative decrease in HAZ, incident stunting) considered?
In its current form, the results and conclusions from the results have problematic implications for the treatment of child malnutrition. The conclusion states: "In settings with high mortality and morbidity in early childhood, such tools could represent a cost-effective way to target resources towards those who need it most." If the current CPR was used in a resource-constrained setting, it would recommend that larger children should be prioritized for nutritional supplementation over already stunted children who may have reached their growth faltering floor. In addition, with a sensitivity of 80%, the tool would miss treating a large number of children who would experience growth faltering. The results of the clinical prediction tool need to be presented with care in how it could be used to prioritize treatment without missing treating children who would benefit from nutritional supplementation. Including absolute HAZ as an outcome will help, along with additional discussion of how the CPR fits alongside current treatment recommendations. For example, does this rule indicate treating children who aren't currently treated, or are there children who don't need treatment given current guidelines and the created CPR.
The results from these datasets may not have identified novel and strong predictors of growth faltering, as the current results indicate that additional predictors beyond current size and age don't help with predictions, but the analysis could be reframed as a template for developing a CPR, using this data as a case study. If age and current size are the only important predictors, then a simple rule based on age and a current HAZ cutoff could be created, negating the need for a more complicated model, but this manuscript also provides a good template for other clinical prediction analyses. Could you comment more on the methods and performance metrics used in the discussion, and make a recommendation for future analyses?
Why use the MAL-ED data to externally validate the CPR developed using GEMS data, given the different study designs, definitions of diarrheal disease, and predictors measured. Because GEMS is a multisite study, wouldn't it be easier, and allow more complex models to be validated, if the model was fit using data from some countries, then validated in populations from other countries?
In addition to the coefficients for the 10-variable model, it would be helpful to present coefficients for the final 2-variable model that was assessed in both GEMS and MAL-ED.
Although the authors opted to use logistic regression based on AUC, the AUC values for random forest models were only slightly lower (Figure S2), and random forest may provide simpler clinical prediction rules. It may be interesting to also describe the rules that were developed by the random forest models. The last panel in Figure S2 may be mislabeled (0-23 mo for MAL-ED instead of 0-59 mo).
I am not very familiar with the variable importance calculated from random forest models. What is the implication of certain features having high variable importance, but also having coefficient estimates that are indistinguishable from the null (e.g., age in MAL-ED, respiratory rate in GEMS in Table S4)?
In the Discussion (p.20), the authors note that the entire diarrheal history of a child may be a more important indicator of linear growth faltering than a single episode. These datasets seem potentially well-suited to directly explore this question: were frequency/number of prior diarrheal episodes investigated as predictors in GEMS / MAL-ED?
For reproducibility, please specify the software and key packages with corresponding versions that were used for this analysis.
The best performing models were logistic regressions fit with variables chosen by random forest models. Any idea why this would be? Is it because they are simpler and the random forest models are overfit to the training data? I would expect them to perform worse because they don't allow for nonlinearity and interactions like a RF model. If generalized linear models perform better than random forest for prediction in this situation, penalized logistic regression models may also improve predictive performance by incorporating variable selection with prediction in a simpler model than random forests.
The conclusion in the abstract is "Our findings indicate that use of prediction rules could help identify children at risk of poor outcomes after an episode of diarrheal illness", but prediction performance is the same in control children, so while it's important to retain the discussion of lack of association between diarrhea and growth, the framing of the paper could be expanded around all children in LMIC, rather than just children with acute diarrhea. This could just be a slight reframing in the writing, or you could expand the MAL-ED prediction model to use all children in addition to the prediction on the subset of children with diarrhea.
What is the rationale for comparing HAZ and MUAC as separate and combined predictors of growth? On one hand, it's interesting to compare which current measures of anthropometry are most associated with future measures of anthropometry, in which case you'd want to include other outcomes such as WHZ, WAZ, and MUAC. But if the goal is to develop the best clinical prediction tool, it makes more sense to include all measures of growth that can be easily clinically collected as predictors to see if performance increases by including WHZ, WAZ, and MUAC on top of HAZ.
Line 125-128: "Model performance was assessed using the receiver operating characteristic (ROC) curves and the cross-validated C-statistic (area under the ROC curve (AUC)), a measures which describes how well a model can discriminate between the two outcomes, from the cross-validation." Confusingly worded… do you mean "AUC is a measure which describes how well a model can predict a binary outcome in test data from the cross-validated folds."
Line 129-142: Model calibration performance metrics: these were new to me, and I wasn't sure what to be looking for or what story they could tell us about model performance beyond the AUC. What is the reader looking for? Can they tell us something different than the AUC?
Line 173: separately report missing versus implausible values, because the percent implausible gives an indication of data quality.
Lines 177-182: Report mean HAZ by country as well to show whether there is lower growth faltering in some countries because of high existing stunting by the age of first measurement.
Line 199: This is the first mention of death as an outcome (and the results of the CPR for death are not discussed).
Page 20: "It is possible that the entire diarrheal history of a child (e.g. frequency and severity of acute diarrhea), or subclinical enteric infections that do not result in diarrhea, are more important to their growth trajectory than a single diarrheal episode, though evidence is mixed." As you have longitudinal data from MAL-ED, can't you explicitly check this by using diarrhea history as a predictor?
Page 21: "Unlike previous work in this area, we used random forests for variable selection which do not require assumptions about the underlying variables and generally outperform(49) conventional model building techniques."
– Need to clarify that random forests have no assumptions about the relationship between variables, not about the variables themselves, which still have assumptions around how they are coded/categorized.
Table S4: Age is the most important predictor, but the OR is 1 with a confidence interval of (1, 1). Can you convert the predictor to age in months or report more decimal places so direction of effect can be seen?
https://doi.org/10.7554/eLife.78491.sa1
Author response
Essential revisions:
The outcome used for prediction is a binary indicator for a decrease in height-for-age Z-score >= 0.5. A child who fails to gain height by future measurements is of concern, but this outcome also misses children who are already experiencing growth failure, and is vulnerable to a regression to the mean effect. The two most important predictors were age and current size, with current size having a positive association with risk of growth faltering. As mentioned in the discussion, there is "the possibility that children need to have high enough HAZ in order to have the potential to falter." Additionally, there may be children with erroneously high height measurements at the first measurement, so that the HAZ change >= 0.5 associated with high baseline HAZ is from measurement-error regression to the mean. I recommend also predicting absolute HAZ (or stunting status) as a secondary outcome and comparing if the important predictors change. Were alternative specifications (e.g., quantitative decrease in HAZ, incident stunting) considered?
Thank you for this suggestion. We have added additional models for the following predictions: (a) growth faltering in those NOT stunted (HAZ≥-2) at presentation, (b) any stunting (HAZ<-2) at follow-up, and (c) any stunting at follow-up in those not stunted at presentation.
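The three additional outcomes listed above can be written out explicitly. The sketch below is an illustration only, assuming the HAZ cutoffs stated in this response (stunted if HAZ < -2; severe growth faltering if HAZ decreases by at least 0.5); the function and key names are hypothetical and do not come from the analysis code.

```python
STUNTING_CUTOFF = -2.0  # HAZ < -2 is classified as stunted
FALTERING_DROP = 0.5    # decrease of >= 0.5 HAZ is severe growth faltering


def is_stunted(haz):
    """True when the height-for-age z-score is below the stunting cutoff."""
    return haz < STUNTING_CUTOFF


def growth_faltered(haz_enroll, haz_follow, drop=FALTERING_DROP):
    """True when HAZ decreased by at least `drop` between the two visits."""
    return (haz_enroll - haz_follow) >= drop


def sensitivity_outcomes(haz_enroll, haz_follow):
    """Outcomes (a), (b), and (c); None marks a child outside that analysis."""
    eligible = not is_stunted(haz_enroll)  # not stunted at presentation
    return {
        "a_faltering_if_not_stunted":
            growth_faltered(haz_enroll, haz_follow) if eligible else None,
        "b_any_stunting_at_follow_up": is_stunted(haz_follow),
        "c_stunting_if_not_stunted":
            is_stunted(haz_follow) if eligible else None,
    }


# Hypothetical child: HAZ of -1.0 at enrollment and -2.5 at follow-up
print(sensitivity_outcomes(-1.0, -2.5))
```

Writing the definitions this way makes the difference between the outcomes concrete: (a) measures a change in HAZ, whereas (b) and (c) depend only on the absolute HAZ at follow-up.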
While we agree the addition of these models improves the manuscript, we also want to highlight that these models have distinct outcomes and therefore have separate clinical uses. Our original goal was to identify children whose growth was likely to slow down after diarrhea. As we show, top predictors and predictive performance are similar for growth faltering across baseline stunting status. We present any stunting at follow-up as a comparison, but argue that this is a different clinical outcome that may warrant different intervention.
P.22 L.335-339: “In sensitivity analyses, we demonstrated our ability to predict any stunting at follow-up with high accuracy (Table 1, Supplementary file 1E). However, this represents a related but distinct outcome from our original aim, namely a slowing down of growth as opposed to stunting, and may warrant different clinical intervention.”
P.23 L.349-353: “Current malnutrition recommendations are based on patient presentation – whether a child is underweight when they present to the clinic. Our CPR could be used to identify children not currently stunted and therefore not currently recommended for nutritional interventions, but who are likely to slow down in growth and therefore at higher risk of incident stunting.”
In its current form, the results and conclusions from the results have problematic implications for the treatment of child malnutrition. The conclusion states: "In settings with high mortality and morbidity in early childhood, such tools could represent a cost-effective way to target resources towards those who need it most." If the current CPR was used in a resource-constrained setting, it would recommend that larger children should be prioritized for nutritional supplementation over already stunted children who may have reached their growth faltering floor. In addition, with a sensitivity of 80%, the tool would miss treating a large number of children who would experience growth faltering. The results of the clinical prediction tool need to be presented with care in how it could be used to prioritize treatment without missing treating children who would benefit from nutritional supplementation. Including absolute HAZ as an outcome will help, along with additional discussion of how the CPR fits alongside current treatment recommendations. For example, does this rule indicate treating children who aren't currently treated, or are there children who don't need treatment given current guidelines and the created CPR.
We thank the Reviewers for pointing out this oversight. We have edited the Discussion for clarity as follows.
P.23 L.348-357: “Our CPR provides a tool for identifying patients likely to experience additional growth faltering after acute diarrhea. Current malnutrition recommendations are based on patient presentation – is a child underweight when they come to the clinic. Our CPR could be used to identify children not currently stunted and therefore not currently recommended for nutritional interventions, but who are likely to slow down in growth and therefore at higher risk of incident stunting. Identifying these children would allow clinicians to connect patients with community-based nutrition interventions (e.g. maternal support for safe introduction of weaning foods, small quantity lipid nutrient supplements (SQ-LNS), etc. (45-48)) to prevent additional effects of chronic malnutrition, namely irreversible stunting.”
P.25 L.386-389: “Our findings indicate that use of prediction rules, potentially applied as clinical decision support tools, could help to identify additional children at risk of poor outcomes after an episode of diarrheal illness, i.e. not currently stunted but likely to decelerate growth.”
The results from these datasets may not have identified novel and strong predictors of growth faltering, as the current results indicate that additional predictors beyond current size and age don't help with predictions, but the analysis could be reframed as a template for developing a CPR, using this data as a case study. If age and current size are the only important predictors, then a simple rule based on age and a current HAZ cutoff could be created, negating the need for a more complicated model, but this manuscript also provides a good template for other clinical prediction analyses. Could you comment more on the methods and performance metrics used in the discussion, and make a recommendation for future analyses?
Thank you for this suggestion. We have edited the Discussion as below:
P.24 L.380-384: “Our study can also serve as a guide for future CPR development. We used a prediction-based approach for variable selection, and compared multiple model fitting strategies. We assessed model calibration as well as discrimination, and reported the results from numerous sensitivity analyses. Finally, we designed our study a priori to incorporate external validation, lending additional confidence to the generalizability of our results.”
Why use the MAL-ED data to externally validate the CPR developed using GEMS data, given the different study designs, definitions of diarrheal disease, and predictors measured? Because GEMS is a multisite study, wouldn't it be easier, and allow more complex models to be validated, if the model was fit using data from some countries, then validated in populations from other countries?
We thank the Reviewers for this suggestion and have added country-specific CPRs in the Supplement. We have also added a sensitivity analysis in which we fit models to all data from one continent in GEMS, and then validated that model on the other continent in GEMS. As shown in Supplementary file 1E, top predictors and discriminative performance were similar across countries and continents.
P.10 L.171-173: “Finally, we conducted a quasi-external validation within the GEMS data by fitting a model to one continent and validating it on the other.”
P.24 L.376-379: “The quasi-external validation between continents within GEMS data, as well as the country-specific models within GEMS, all had similar top predictors and discriminative performance, further supporting the overall validity of our CPR. Finally, we explored a range of AFe cutoffs for etiology, with consistent results.”
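The continent-to-continent check described in this response can be sketched as a group-wise hold-out split. The example below is an illustration in Python only, not the authors' R code; the function name, the record fields (`continent`, `haz`, `faltered`), and the toy values are all hypothetical.

```python
def group_holdout_splits(records, group_key):
    """Yield (held_out_group, train, test) for each distinct group.

    A model would be fit on `train` (every record outside the held-out
    group) and evaluated on `test` (only the held-out group).
    """
    groups = sorted({r[group_key] for r in records})
    for held_out in groups:
        train = [r for r in records if r[group_key] != held_out]
        test = [r for r in records if r[group_key] == held_out]
        yield held_out, train, test


# Hypothetical toy records; the values are made up for illustration.
records = [
    {"continent": "Africa", "haz": -1.2, "faltered": 1},
    {"continent": "Africa", "haz": -2.5, "faltered": 0},
    {"continent": "Asia", "haz": -0.8, "faltered": 1},
    {"continent": "Asia", "haz": -1.9, "faltered": 0},
    {"continent": "Asia", "haz": -0.3, "faltered": 1},
]

splits = list(group_holdout_splits(records, "continent"))
for held_out, train, test in splits:
    print(f"validate on {held_out}: fit n={len(train)}, test n={len(test)}")
```

With two continents this yields exactly two splits (fit on one, validate on the other, and the reverse). Passing a country field as `group_key` would give a leave-one-country-out scheme of the kind the Reviewer suggested.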
In addition to the coefficients for the 10-variable model, it would be helpful to present coefficients for the final 2-variable model that was assessed in both GEMS and MAL-ED.
We have added this as requested to the Supplementary file 1D.
Although the authors opted to use logistic regression based on AUC, the AUC values for random forest models were only slightly lower (Figure S2), and random forest may provide simpler clinical prediction rules. It may be interesting to also describe the rules that were developed by the random forest models. The last panel in Figure S2 may be mislabeled (0-23 mo for MAL-ED instead of 0-59 mo).
Thank you for the correction; we have edited the figure as appropriate. All of our results are presented in the manuscript and Supplement; we did not develop additional CPRs.
I am not very familiar with the variable importance calculated from random forest models. What is the implication of certain features having high variable importance, but also having coefficient estimates that are indistinguishable from the null (e.g., age in MAL-ED, respiratory rate in GEMS in Table S4)?
We appreciate that this is a very important question. As described on P.7 L.116-118, we defined variable importance as the reduction in mean squared prediction error achieved by including the variable in the predictive model. In other words, we selected variables based on how much they improved the predictive performance of the overall model. This is a different analytic goal than testing the hypothesis that a variable is or is not associated with the outcome (e.g. does the confidence interval for the odds ratio cross 1). If our goal had been to explore the association between potential risk factors and growth faltering, we would have implemented a different variable selection process. Please refer to Shmueli 2010 and van Diepen 2017 for additional details.
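To make the distinction concrete, the sketch below shows one common way to quantify a variable's contribution to prediction: permutation importance, the average increase in mean squared error when that variable's values are shuffled. This is an assumption chosen for illustration and may not match the exact importance measure the authors computed in R; the toy model, the field names (`haz`, `noise`), and the values are hypothetical.

```python
import random


def mse(y_true, y_pred):
    """Mean squared prediction error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, y, column, n_repeats=20, seed=0):
    """Average increase in MSE when `column` is shuffled across rows."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(r) for r in rows])
    increases = []
    for _ in range(n_repeats):
        values = [r[column] for r in rows]
        rng.shuffle(values)
        # Copy each row, replacing only the shuffled column.
        permuted = [dict(r, **{column: v}) for r, v in zip(rows, values)]
        increases.append(mse(y, [predict(r) for r in permuted]) - baseline)
    return sum(increases) / n_repeats


# Hypothetical fitted model: predictions depend on HAZ only.
def predict(row):
    return 0.5 + 0.1 * row["haz"]


rows = [{"haz": h, "noise": n} for h, n in
        [(-3.0, 5), (-2.0, 1), (-1.0, 4), (0.0, 2), (1.0, 6), (2.0, 3)]]
y = [predict(r) for r in rows]  # toy outcome, perfectly predicted

imp_haz = permutation_importance(predict, rows, y, "haz")
imp_noise = permutation_importance(predict, rows, y, "noise")
print(f"haz: {imp_haz:.4f}, noise: {imp_noise:.4f}")
```

The measure asks only whether predictions get worse without the variable; it says nothing about whether a regression coefficient for that variable is distinguishable from the null.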
In the Discussion (p.20), the authors note that the entire diarrheal history of a child may be a more important indicator of linear growth faltering than a single episode. These datasets seem potentially well-suited to directly explore this question: were frequency/number of prior diarrheal episodes investigated as predictors in GEMS / MAL-ED?
We thank the Reviewers for bringing this omission to our attention. The study design of GEMS does not make it possible to assess history of previous diarrhea episodes. There are a number of variables that approximate this in MAL-ED, which have already been considered as potential predictors in the re-derivation. We have edited the Discussion as follows to incorporate this.
P.22 L.322-328: “It is possible that the entire diarrheal history of a child (e.g. frequency and severity of acute diarrhea), or subclinical enteric infections that do not result in diarrhea, are more important to their growth trajectory than a single diarrheal episode, though evidence is mixed (13, 26, 37). Indeed, while the design of GEMS does not allow for the exploration of this hypothesis, MAL-ED does. Total days in all diarrheal episodes, days with diarrhea so far this episode, and days since last diarrhea episode were all top-10 predictors of growth faltering in MAL-ED.”
For reproducibility, please specify the software and key packages with corresponding versions that were used for this analysis.
Thanks for this suggestion; we have added this as below.
P.10 L.173-174: “All analysis was conducted in R 4.0.2 using the packages “ranger,” “cvAUC,” and “pROC.””
The best-performing models were logistic regressions fit with variables chosen by random forest models. Any idea why this would be? Is it because they are simpler and the random forest models are overfit to the training data? I would expect them to perform worse because they don't allow for nonlinearity and interactions like a RF model. If generalized linear models perform better than random forest for prediction in this situation, penalized logistic regression models may also improve predictive performance by incorporating variable selection with prediction in a simpler model than random forests.
We agree that exploring alternative model building strategies could prove fruitful. However, incorporating variable selection and prediction into a single model building process such as with ridge regression or elastic net could lead to a more complicated final model, as less important coefficients approach but do not reach 0. In any case, an exhaustive comparison between model building strategies was beyond the scope of this study.
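The point about coefficients that "approach but do not reach 0" can be illustrated for the ridge penalty in a deliberately simplified setting. The sketch assumes linear regression with orthonormal predictors, where each ridge estimate equals the ordinary least squares (OLS) estimate divided by (1 + penalty); this is not the logistic setting of the manuscript, and the coefficient values are hypothetical.

```python
def ridge_coefficients(ols_coefficients, penalty):
    """Ridge estimates under an orthonormal design: OLS / (1 + penalty)."""
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    return [b / (1.0 + penalty) for b in ols_coefficients]


ols = [2.0, 0.5, 0.05]  # hypothetical OLS coefficients
for penalty in (0.0, 1.0, 9.0, 99.0):
    shrunk = ridge_coefficients(ols, penalty)
    print(penalty, [round(b, 4) for b in shrunk])
```

Every coefficient shrinks by the same factor, so a nonzero OLS estimate stays nonzero for any finite penalty and the variable remains in the model.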
The conclusion in the abstract is "Our findings indicate that use of prediction rules could help identify children at risk of poor outcomes after an episode of diarrheal illness", but prediction performance is the same in control children, so while it's important to retain the discussion of lack of association between diarrhea and growth, the framing of the paper could be expanded around all children in LMIC, rather than just children with acute diarrhea. This could just be a slight reframing in the writing, or you could expand the MAL-ED prediction model to use all children in addition to the prediction on the subset of children with diarrhea.
We thank the Reviewer for this suggestion. We feel it is important to retain the focus on acute diarrhea, as this represents an easy point of access to identify children who are struggling. A community screening program that would utilize a CPR predicting growth faltering in both symptomatic and non-symptomatic children could also be beneficial, but is a different goal and would fit in a different type of intervention than our original research goal. Per your suggestion, we have edited the Abstract and Discussion as follows to highlight this possibility.
P.2: “Abstract Conclusions: Our findings indicate that use of prediction rules could help identify children at risk of poor outcomes after an episode of diarrheal illness. They may also be generalizable to all children, regardless of diarrhea status.”
P.23 L.357-360: “Given our ability to predict growth faltering in healthy controls in GEMS, community screening for those at risk of growth faltering (not just those presenting with acute diarrhea) may also be prudent. This would represent a different potential intervention strategy and future research should explore this possibility further.”
What is the rationale for comparing HAZ and MUAC as separate and combined predictors of growth? On one hand, it's interesting to compare which current measures of anthropometry are most associated with future measures of anthropometry, in which case you'd want to include other outcomes such as WHZ, WAZ, and MUAC. But if the goal is to develop the best clinical prediction tool, it makes more sense to include all measures of growth that can be easily clinically collected as predictors to see if performance increases by including WHZ, WAZ, and MUAC on top of HAZ.
Our goal was to develop the best clinical prediction tool in terms of predictive ability AND ease of use. As we show in Supplementary file 1E, the model considering HAZ as the only growth metric performed better than the model considering only MUAC, and performed just as well as the model considering both. Therefore, we concluded the most predictive and parsimonious model included HAZ as the only growth metric. Because our goal was to develop a clinical prediction rule for pediatric patients with acute diarrhea, we chose not to consider WHZ and WAZ. Child weight is highly susceptible to dehydration status, especially in the youngest children (Modi 2015). Therefore, weight-based growth metrics can be highly inaccurate during acute diarrheal illness.
Line 125-128: "Model performance was assessed using the receiver operating characteristic (ROC) curves and the cross-validated C-statistic (area under the ROC curve (AUC)), a measures which describes how well a model can discriminate between the two outcomes, from the cross-validation." Confusingly worded… do you mean "AUC is a measure which describes how well a model can predict a binary outcome in test data from the cross-validated folds."
Thanks for this suggestion; our edits are below.
P.8 L.125-128: “Model performance was assessed using the receiver operating characteristic (ROC) curves and the cross-validated C-statistic (area under the ROC curve (AUC)). The AUC describes how well a model can discriminate between a binary outcome in the test data from the cross-validated folds.”
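The AUC described here has a direct interpretation: the probability that a randomly chosen child with the outcome receives a higher predicted risk than a randomly chosen child without it. The sketch below computes it by pairwise comparison and then averages per-fold values, which is one common convention for a cross-validated AUC. It is a Python illustration rather than the authors' R code, and the function names and toy data are hypothetical.

```python
def auc(labels, scores):
    """Share of (case, non-case) pairs where the case scores higher.

    Ties count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    non_cases = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


def cross_validated_auc(labels, scores, fold_ids):
    """Mean of the AUCs computed on each fold's held-out predictions."""
    per_fold = []
    for fold in sorted(set(fold_ids)):
        idx = [i for i, f in enumerate(fold_ids) if f == fold]
        per_fold.append(auc([labels[i] for i in idx],
                            [scores[i] for i in idx]))
    return sum(per_fold) / len(per_fold)


# Toy example: 3 of the 4 (case, non-case) pairs are ordered correctly.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)
```

A value of 0.5 corresponds to ranking no better than chance, and 1.0 to perfect separation of the two outcome groups.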
Line 129-142: Model calibration performance metrics: these were new to me, and I wasn't sure what to be looking for or what story they could tell us about model performance beyond the AUC. What is the reader looking for? Can they tell us something different than the AUC?
Model discrimination (e.g. AUC) and model calibration are indeed two different metrics for evaluating predictive performance. Discrimination refers to a model’s ability to correctly separate who does and does not experience the outcome of interest. Calibration refers to a model’s ability to correctly estimate the risk of the outcome. In other words, how similar is the predicted number of events to the observed number of events. We have added the following brief explanation to the manuscript and included an additional citation for interested readers.
P.8 L.129-130: “Calibration refers to a model’s ability to correctly estimate the risk of the outcome (34). We assessed model calibration both quantitatively and graphically…”
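A small numerical sketch may help show why calibration carries information the AUC does not. It compares observed with expected event counts overall and within groups of predicted risk, then halves every predicted risk: the ranking of children is unchanged (so discrimination is unchanged), but the model now under-predicts events by a factor of two. The function names and toy data are hypothetical, and these are generic calibration summaries rather than necessarily the specific metrics used in the manuscript.

```python
def calibration_in_the_large(labels, probs):
    """Return (observed events, expected events, observed/expected ratio)."""
    observed = sum(labels)
    expected = sum(probs)
    return observed, expected, observed / expected


def calibration_table(labels, probs, n_groups=2):
    """(mean predicted risk, observed event rate) per group of sorted risk."""
    if not 1 <= n_groups <= len(probs):
        raise ValueError("n_groups must be between 1 and the sample size")
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    size = len(order) // n_groups
    rows = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any remainder.
        stop = (g + 1) * size if g < n_groups - 1 else len(order)
        idx = order[start:stop]
        rows.append((sum(probs[i] for i in idx) / len(idx),
                     sum(labels[i] for i in idx) / len(idx)))
    return rows


labels = [0, 0, 0, 1, 0, 1, 1, 1]
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
halved = [p / 2 for p in probs]  # same ranking, systematically too low

_, _, oe_original = calibration_in_the_large(labels, probs)
_, _, oe_halved = calibration_in_the_large(labels, halved)
table = calibration_table(labels, probs)
print(round(oe_original, 3), round(oe_halved, 3), table)
```

An observed/expected ratio near 1, and group-level points lying near the diagonal of a calibration plot, indicate that predicted risks can be read as actual probabilities.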
Line 173: separately report missing versus implausible values, because the percent implausible gives an indication of data quality.
We have edited as follows.
P.10 L.177-180: “There were 9439 children with acute diarrhea enrolled in GEMS. In the analysis of the primary outcome (growth faltering), 110 observations were dropped for having follow-up measurements taken <49 or >91 days after enrollment, and 1276 were dropped for having implausible HAZ measurements, leaving an analytic sample of 8053.”
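The sample flow quoted above can be checked with simple arithmetic. The sketch below encodes the follow-up window (measurements taken fewer than 49 or more than 91 days after enrollment are excluded) and the stated counts; the function and variable names are hypothetical, and the criteria for an implausible HAZ value are not reproduced because they are not given in this response.

```python
def within_followup_window(days_since_enrollment, earliest=49, latest=91):
    """True if the follow-up measurement falls inside the accepted window."""
    return earliest <= days_since_enrollment <= latest


# Counts as stated in the quoted manuscript text.
enrolled = 9439
dropped_followup_timing = 110
dropped_implausible_haz = 1276

analytic_sample = enrolled - dropped_followup_timing - dropped_implausible_haz
print(analytic_sample)
```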
Lines 177-182: Report mean HAZ by country as well to show if there is lower growth faltering in some countries because of high existing stunting by the age of first measurement.
We thank the Reviewers for this excellent suggestion, and have added the requested data in Supplementary file 1B. We have edited the Discussion as follows.
P.22 L.328-333: “Furthermore, the average baseline HAZ at enrollment was 0.5 HAZ lower in children who did not experience growth faltering than in children who did (Supplemental Figure S4), suggesting the possibility that children need to have high enough HAZ in order to have the potential to falter. In contrast, children enrolled in Mali had the highest median HAZ at enrollment, and also had the lowest proportion of children who experienced growth faltering (Supplementary file 1B).”
Line 199: This is the first mention of death as an outcome (and the results of the CPR for death are not discussed).
This was an oversight from a previous draft of the manuscript and has been removed. Thanks for pointing this out.
Page 20: "It is possible that the entire diarrheal history of a child (e.g. frequency and severity of acute diarrhea), or subclinical enteric infections that do not result in diarrhea, are more important to their growth trajectory than a single diarrheal episode, though evidence is mixed." As you have longitudinal data from MAL-ED, can't you explicitly check this by using diarrhea history as a predictor?
Please see response and edits listed above.
Page 21: "Unlike previous work in this area, we used random forests for variable selection which do not require assumptions about the underlying variables and generally outperform(49) conventional model building techniques."
– Need to clarify that random forests have no assumptions about the relationship between variables, not about the variables themselves, which still have assumptions around how they are coded/categorized.
Thank you for pointing this out; we have edited the text as follows.
P.24 L.363-365: “Unlike previous work in this area, we used random forests for variable selection which do not require assumptions about relationships between the underlying variables and generally outperform(50) conventional model building techniques.”
Tables S4- Age is the most important predictor, but the OR is 1 with 1,1 confidence intervals. Can you convert the predictor to age in months or report more decimal places so direction of effect can be seen?
As discussed earlier in this Response to Reviewers, the goal of our CPRs was prediction, not to assess the association or effect of a risk factor on the outcome. Therefore, it is inappropriate to judge the relationship between risk factors and the outcome using these CPR models, because the analytic strategy does not support such inference.
References
Shmueli, G. To explain or to predict? Statist. Sci. 25(3): 289-310 (August 2010). DOI:10.1214/10-STS330. https://arxiv.org/pdf/1101.0891.pdf.
van Diepen, M., Ramspek, C.L., Jager, K.J., Zoccali, C., Dekker, F.W. Prediction versus aetiology: common pitfalls and how to avoid them. Nephrol Dial Transplant. 2017 Apr 1;32. PMID: 28339854.
Modi, P., Nasrin, S., Hawes, M., Glavis-Bloom, J., Alam, N.H., Hossain, M.I., Levine, A.C. Midupper arm circumference outperforms weight-based measures of nutritional status in children with diarrhea. J. Nutr. 2015 Jul; 145(7):1582-7. PMID: 25972523.
https://doi.org/10.7554/eLife.78491.sa2
Article and author information
Author details
Funding
National Institute of Allergy and Infectious Diseases (R01AI135114)
- Sharia M Ahmed
- Ben J Brintz
- Daniel T Leung
National Institutes of Health (T32AI055434)
- Sharia M Ahmed
The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.
Acknowledgements
This work was supported by National Institutes of Health under Ruth L Kirschstein National Research Service Award NIH T32AI055434 and by the National Institute of Allergy and Infectious Diseases (R01AI135114). The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.
Ethics
Parents or caregivers of participants provided informed consent, either in writing or witnessed if parents or caregivers were illiterate. The GEMS study protocol was approved by ethical review boards at each field site and the University of Maryland, Baltimore, USA. The MAL-ED study protocol was approved by ethical review boards at each field site and the Johns Hopkins Institutional Review Board, Baltimore, USA. This analysis utilized publicly available data from both studies, see Data Availability statement, and as such is non-human subjects research.
Senior and Reviewing Editor
- Eduardo Franco, McGill University, Canada
Reviewer
- Andrew N Mertens, University of California, Berkeley, United States
Publication history
- Received: March 9, 2022
- Preprint posted: March 10, 2022
- Accepted: December 29, 2022
- Accepted Manuscript published: January 6, 2023 (version 1)
- Version of Record published: January 11, 2023 (version 2)
Copyright
© 2023, Ahmed et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.