Abstract
Given a lifetime risk of ~90% by the ninth decade of life, it is unknown whether there are true controls for hypertension in epidemiological and genetic studies. Here, we compared Bayesian logistic and time-to-event approaches to modeling hypertension. The median age at hypertension was approximately a decade earlier in African Americans than in European Americans or Mexican Americans. The probability of being free of hypertension at 85 years of age in African Americans was less than half that in European Americans or Mexican Americans. In all groups, baseline hazard rates increased until nearly 60 years of age and then decreased but did not reach zero. Taken together, modeling of the baseline hazard function of hypertension suggests that there are no true controls and that controls in logistic regression are cases with a late age of onset.
Introduction
Hypertension, or abnormally high blood pressure, is common in the US, affecting approximately 45% of adults (Centers for Disease Control and Prevention, National Center for Health Statistics, 2018). Hypertension is a risk factor for heart disease and stroke and causes or contributes to nearly half a million deaths a year (Centers for Disease Control and Prevention, 2019). Globally, an estimated 1.13 billion people have hypertension, and less than 20% of these people have their blood pressure under control (World Health Organization, 2019).
Systolic blood pressure (SBP) has a general tendency to increase linearly with age, across sexes and ethnic groups (Burt et al., 1995). Diastolic blood pressure (DBP) has a general tendency to increase until the end of the fifth decade of life, after which DBP either stabilizes or decreases, again across sexes and ethnic groups (Burt et al., 1995). In the Framingham Heart Study, an individual who is normotensive at 55–65 years of age has an 80–90% residual lifetime risk of developing hypertension, adjusted for competing causes of mortality (Vasan et al., 2002). Compared to age-matched European Americans, hypertension in African Americans develops at an earlier age and is more prevalent (Chobanian et al., 2003; Cooper et al., 1996; Mozaffarian et al., 2016).
A common approach in genetic epidemiology studies of hypertension codes the outcome as a binary variable representing cases and controls and proceeds with logistic regression. Given that the lifetime risk is so high, we first investigated whether a proportional hazards model in time-to-event analysis yields a better fit than logistic regression. Second, as time-to-event analysis assumes that the event will occur, that is, that every individual will become hypertensive if they live long enough, we investigated a proportional hazards model including a fraction of individuals that will never become hypertensive and hence are true epidemiological controls. Third, using an agnostic model screening approach, we explored the issue of what covariates to include and how much variance they explain. We performed these analyses in an observational study of African Americans and then replicated and extended our findings in a nationally representative study of African Americans, European Americans, and Mexican Americans.
Results
Time-to-event analysis
Across African Americans, European Americans, and Mexican Americans, median systolic blood pressure increased with age, whereas median diastolic blood pressure increased and then decreased (Figure 1). In time-to-event analysis, the probability of not having hypertension decreased across the entire age range (Figure 2). However, in all three groups, there was an inflection point in middle age after which the probability of having hypertension increased at a slower rate (Figure 2). Despite this slowdown, the probability of not having hypertension was not zero, even by the middle of the ninth decade (Figure 2). In the discovery study (HUFS), by 85 years of age, 12.2% (95% credible interval (CI) [5.1%, 20.6%]) of African Americans remained free of hypertension (Figure 2). In the replication study (NHANES), by 85 years of age, 8.4% (95% CI [5.4%, 11.6%]) of African Americans, 21.4% (95% CI [18.1%, 24.6%]) of European Americans, and 20.6% (95% CI [13.6%, 27.3%]) of Mexican Americans remained free of hypertension (Figure 2). The median age at which hypertension occurred was 48 (95% CI [45, 50]) years for African Americans in HUFS and 42 (95% CI [40, 44]) years for African Americans, 57 (95% CI [55, 59]) years for European Americans, and 56 (95% CI [54, 58]) years for Mexican Americans in NHANES (Figure 2).
We next investigated seven distributions, six parametric and one nonparametric, for the baseline hazard function in a proportional hazards model. Generalized gamma, log-logistic, and exponential distributions all yielded poor fits to the HUFS data (Figure 3). Lognormal, gamma, and Weibull distributions all yielded good fits to the HUFS data (Figure 3), with the lognormal distribution yielding the highest likelihood of the six parametric distributions (Table 1). Overall, a nonparametric hazard function yielded the highest likelihood (Table 1). However, we could not perform model comparison between the lognormal distribution of the baseline hazard function and the nonparametric hazard function in the frequentist framework because it is unclear how to measure the dimensionality of the nonparametric hazard function.
Based on a Bayesian model in which the hazard rates were gamma distributed with a correlated prior process, we estimated the underlying hazard rates. Across African Americans, European Americans, and Mexican Americans, hazard rates increased and then decreased (Figure 4). The initial hazard rate for African Americans (0.0123 (95% highest posterior density interval [0.0091, 0.0161])) was larger than the initial hazard rate for both European Americans (0.0028 (95% highest posterior density interval [0.0013, 0.0044])) and Mexican Americans (0.0036 (95% highest posterior density interval [0.0019, 0.0052])). The maximum hazard rate for African Americans (0.0630 (95% highest posterior density interval [0.0341, 0.0940])) was trending larger than the maximum hazard rate for European Americans (0.0368 (95% highest posterior density interval [0.0200, 0.0546])) and Mexican Americans (0.0417 (95% highest posterior density interval [0.0219, 0.0635])). The age of maximum hazard was 55 (95% CI [43, 72]) years in HUFS and 58 (95% CI [48, 67]) years in African Americans, 60 (95% CI [52, 70]) years in European Americans, and 58 (95% CI [48, 69]) years in Mexican Americans in NHANES (Figure 4).
Proportional hazards vs. logistic models
Using the HUFS data, we found that logistic regression yielded a better fit than the proportional hazards model based on the DIC, provided that both age and age^{2} were included as covariates in the logistic regression model (Table 2). Using logistic regression, the addition of age to the reduced (intercept-only) model resulted in a substantially lower DIC (Table 2), with a linear effect of age explaining 27.8% of the variance at the cost of one additional parameter. The addition of age^{2} further decreased the DIC (Table 2), explaining an additional 0.8% of the variance at the cost of one additional parameter. With smoothing, the effective dimensionality of the proportional hazards model was 3.3, comparable to the dimensionality of 3.0 for the logistic model adjusted for age and age^{2} (Table 2). We also found that inclusion of a permanent stayer fraction increased the DIC of the proportional hazards model, indicating that inclusion of a permanent stayer fraction was not supported (Table 2).
Model selection
We used forward-backward regression to perform model selection on a set of 38 potential covariates. In the HUFS data set, the final model included six of these covariates: chloride, insulin, low-density lipoprotein cholesterol, potassium, uric acid, and weight. Compared to the logistic model with age and age^{2}, these six covariates improved the fit (p=2.01 × 10^{−15}) and explained an additional 8.1% of the variance, for a total of 36.7% of variance explained. Each of these six covariates was replicated (p-values from 2.81 × 10^{−2} to 1.83 × 10^{−12}) and directionally consistent in NHANES African Americans (Table 3). Furthermore, chloride, low-density lipoprotein cholesterol, potassium, uric acid, and weight, but not insulin, were significant covariates in NHANES European Americans, whereas chloride, insulin, low-density lipoprotein cholesterol, uric acid, and weight, but not potassium, were significant covariates in NHANES Mexican Americans (Table 4). The borderline nonsignificance of potassium in NHANES Mexican Americans reflected a smaller sample variance, a smaller effect size estimate, and a larger standard error for the effect size, which could be compensated for by increasing the sample size by 20%. In NHANES, the variance explained by age, age^{2}, chloride, low-density lipoprotein cholesterol, uric acid, and weight was 40.9% in African Americans, 34.8% in European Americans, and 28.3% in Mexican Americans.
We further investigated two unexpected results of the model selection. First, sex was not selected in the final model. Of the selected covariates, three showed sex dimorphisms in HUFS by Welch’s t-test, with uric acid and weight higher in males and low-density lipoprotein cholesterol higher in females (Table 5). Second, across African Americans, European Americans, and Mexican Americans, increasing low-density lipoprotein cholesterol was associated with decreased risk of hypertension (Tables 3 and 4). As the direction of this effect was unexpected, we reanalyzed the HUFS African Americans accounting for lipid medications. Self-reported use of any lipid medication (coded as yes/no) was associated with increased risk of hypertension (p=1.25 × 10^{−3}) but increasing low-density lipoprotein cholesterol remained associated with decreased risk of hypertension (p=4.46 × 10^{−2}). Of the individuals in this subanalysis, 8.3% reported use of any lipid medication, all instances of which involved statins, so drug class was not a confounder.
To investigate the possibility of time-dependent covariates, we added interaction terms to the full logistic model (with age, age^{2}, and the selected covariates). For each of the selected covariates, the Akaike information criterion was larger for the model containing a term that interacted with age (Table 6). Thus, the evidence does not support time-dependence for any of the selected covariates.
2017 revised classification of hypertension
We reanalyzed the HUFS data based on the 2017 reclassification of hypertension as SBP ≥130 mm Hg or DBP ≥80 mm Hg (Whelton et al., 2018). Under these more stringent thresholds, the prevalence of hypertension increased from 48.3% to 66.1%, the median time to hypertension decreased to 36 years (95% CI [33, 39]), the age at maximum hazard increased to 57 (95% CI [33, 72]) years, and the lifetime risk at 85 years of age increased to 95.8% (95% CI [90.3%, 99.2%]). The 95% highest posterior density interval of the permanent stayer fraction included zero.
Discussion
Time-to-event analysis of hypertension with a primary focus on modeling the baseline hazard function recapitulated three known health disparities. One, at the earliest ages, the baseline hazard rate was higher in African Americans compared to European Americans and Mexican Americans. Two, the median age when hypertension occurred was approximately a decade earlier for African Americans compared to European Americans and Mexican Americans. Three, by the middle of the ninth decade of life, the probability of African Americans remaining free of hypertension was less than half the probability for European Americans and Mexican Americans.
The Bayesian model of the hazard rates revealed an inflection point, consistent with the lognormal parametric distribution yielding the best fit among the parametric distributions we explored. A decrease in hazard rates implies that the number of events decreases or that the number of individuals at risk increases. We suggest that the simplest explanation for the inflection point in hazard rates is the change in trajectory of DBP, with the decrease in DBP leading to a decreased hazard of hypertension and consequently a reduced number of events. We cannot rule out the possibility that there are individuals who are at less risk, although the finding that the hazard rate does not reach zero (by 85 years of age) indicates the continued presence of risk. It is also possible that mortality due to comorbidities begins to subside by around 60 years of age, leading to reduced hazard rates of hypertension in older ages.
To enable model comparison between logistic regression and proportional hazards analysis, we used the DIC, which is based on an estimate of dimensionality. The existence of the inflection point in the baseline hazard function implies an approximately quadratic effect of age, which in logistic regression is captured by an age^{2} term with a negative regression coefficient. Our results indicated that the proportional hazards model we used, which incorporated smoothing to prevent overparameterization, and logistic regression with age and age^{2} provide comparable fits. Therefore, we recommend that studies of hypertension using a case-control design should always include a logistic regression adjustment for both linear and quadratic effects of age.
In addition to nonzero hazard rates, evidence for the lack of existence of true controls comes from the modeling of a permanent stayer fraction. We found that the permanent stayer fraction essentially corresponded to the fraction of individuals who had not yet become hypertensive at the last observed age. As the permanent stayer fraction did not capture any additional information, models including the permanent stayer fraction had a worse fit compared to models without that additional parameter. A major implication for proportional hazards analysis is that the assumption that the event will occur for every individual at some point in time, unless the individual dies first, is valid. A major implication for logistic regression is that controls should be interpreted as individuals that have not yet become hypertensive, rather than as individuals who will not become hypertensive.
Traditional risk factors for hypertension include excess body weight, excess dietary sodium intake, reduced physical activity, deficiency of potassium, and excess alcohol intake (Chobanian et al., 2003). Our results have three implications regarding these risk factors. One, we confirmed that excess weight and low potassium, but not excess sodium, were risk factors in our multiple regression model. It is possible that the inclusion of chloride could account for the effects of sodium. We found that chloride was negatively associated with hypertension. Although high dietary intake of chloride is a risk factor for hypertension, lower serum chloride levels have been associated with higher risks of hypertension and cardiovascular disease and higher all-cause and cause-specific mortality among hypertensives (McCallum et al., 2013; Taylor et al., 2007). Two, longitudinal analysis of 30 years of follow-up in the Framingham Heart Study showed that the incidence rate of hypertension increased faster in females than in males, with females having lower incidence under 50 years of age and higher incidence over 50 years of age (Dannenberg et al., 1988). In contrast, we found that sex was not a significant covariate, although low-density lipoprotein cholesterol, uric acid, and weight showed sex dimorphisms. Three, we did not have data on birth weight, but our findings regarding hazard rates at early ages are consistent with low birth weight being a risk factor for hypertension (Lackland et al., 2003) and African Americans having lower birth weight than European Americans (David and Collins, 1997).
Low-density lipoprotein cholesterol has been reported to be positively associated with hypertension, but this association generally does not remain significant after covariate adjustment (Haffner et al., 1992; Laaksonen et al., 2008; Otsuka et al., 2016; Sesso et al., 2005; Wildman et al., 2004). In contrast, we found that low-density lipoprotein cholesterol was negatively associated with hypertension, across African Americans, European Americans, and Mexican Americans. The explanation for this discrepancy is unclear, but we present evidence against three possibilities. We obtained a negative association in both single and multiple regression models, suggesting that the opposite direction of effect was not due to other (known) covariates. We also found that the opposite direction of effect was not due to a time-dependent effect. Furthermore, the opposite direction of effect was not due to confounding by self-reported use of medication (i.e., statins). Studies of the relationship between low-density lipoprotein cholesterol and healthy aging have led to conflicting conclusions, with some studies reporting a negative association (Barzilai et al., 2001; Postmus et al., 2015) and other studies reporting a positive association (Lv et al., 2015; Lv et al., 2019; Rantanen et al., 2015). One possible explanation for this discrepancy is survival bias, if individuals with higher low-density lipoprotein cholesterol disproportionately experience mortality prior to the onset of hypertension. Given that both data sets in our study were observational, we lack follow-up data to model all-cause or cause-specific mortality as a competing risk.
Uric acid is associated with hypertension, but whether this association is causal remains unestablished. Mendelian randomization (MR) studies have provided conflicting evidence regarding the causality of uric acid for hypertension, with evidence for no effect (Palmer et al., 2013), protection (Sedaghat et al., 2014), and risk (Parsa et al., 2012). MR studies have reported that uric acid is not causal for adiposity, chronic kidney disease, triglycerides, type 2 diabetes, or obesity (Jordan et al., 2019; Kleber et al., 2015; Lyngdoh et al., 2012; Rasheed et al., 2014). In contrast, higher adiposity, higher body mass index, lower high-density lipoprotein cholesterol, and higher triglycerides are causally associated with increased uric acid (Lyngdoh et al., 2012; Palmer et al., 2013; Rasheed et al., 2014; Yu et al., 2019). The association of uric acid with hypertension may reflect pleiotropy through linked metabolic pathways, perhaps those involving lipid metabolism (Li et al., 2019).
The African Americans in HUFS were all recruited and enrolled in Washington, D.C. The fact that neither the last completed grade of education nor income was a significant predictor during model selection may reflect homogeneity of study participants within one city. In contrast, the African Americans in NHANES were recruited nationwide. All covariates that were significant during model selection in HUFS replicated in NHANES African Americans, indicating that the findings in HUFS were generalizable to the national level.
In summary, by Bayesian modeling of the baseline hazard function in time-to-event analysis of hypertension, we found that logistic regression, if adjusted for both linear and quadratic effects of age, yielded a fit comparable to proportional hazards regression. We found no evidence to support the existence of true controls, suggesting that if an individual lives long enough, hypertension is inevitable. Finally, we found that the combination of chloride, low-density lipoprotein cholesterol, uric acid, and weight, in addition to age and age^{2}, accounted for 40.9% of the variance of hypertension in African Americans, 34.8% in European Americans, and 28.3% in Mexican Americans, simultaneously consistent with common risk factors among the groups and heterogeneity across the groups.
Materials and methods
Discovery study
The Howard University Family Study (HUFS) is a population-based observational study of African American families and unrelated individuals from Washington, D.C. (Adeyemo et al., 2009). Ethics approval for HUFS was obtained from the Howard University Institutional Review Board (protocol number IRB06GSAS32A) and written informed consent was obtained from each participant. All clinical investigation was conducted according to the principles expressed in the Declaration of Helsinki. Families and individuals were not ascertained based on any phenotype. Weight was measured on an electronic scale to the nearest 0.1 kg. Height was measured on a stadiometer to the nearest 0.1 cm. Body mass index (BMI) was calculated as weight divided by the square of height (kg/m^{2}). Waist circumference was measured to the nearest 0.1 cm at the narrowest part of the torso. Hip circumference was measured to the nearest 0.1 cm at the widest part of the buttocks. The waist-hip ratio was calculated as waist circumference in cm divided by hip circumference in cm. Fat mass and fat-free mass were estimated using bioelectrical impedance analysis with a validated population-specific equation as previously described (Luke et al., 1997). Percent fat mass was defined as fat mass divided by weight × 100. Blood pressure was measured while seated using an oscillometric device (Omron Healthcare, Inc, Bannockburn, Illinois). Three readings were taken at 10 min intervals. Reported readings were the averages of the second and third readings. Hypertension was defined as SBP ≥140 mm Hg, DBP ≥90 mm Hg, or treatment with antihypertensive medication. Blood was drawn after an overnight fast of at least 8 hr and all collected samples were stored at −80°C pending biochemical assay.
Creatinine, total cholesterol, high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, triglycerides, fructosamine, glucose, alkaline phosphatase, alanine aminotransferase, total bilirubin, sodium, potassium, chloride, calcium, uric acid, urea, C-reactive protein, albumin, bicarbonate, and total protein were measured using COBAS INTEGRA tests (Roche Diagnostics, Indianapolis, Indiana). Cortisol and insulin were measured using Elecsys assays (Roche Diagnostics). Creatinine clearance was calculated using the Cockcroft-Gault equation and the estimated glomerular filtration rate (eGFR) was calculated using the four-variable Modification of Diet in Renal Disease Study equation (National Kidney Foundation, 2002; Levey et al., 1999). Type 2 diabetes (T2D) case status was defined as fasting plasma glucose level ≥126 mg/dL or treatment with antidiabetic medication. T2D control status was defined as fasting plasma glucose ≤100 mg/dL and no treatment with antidiabetic medication. The last completed grade of education and income were self-reported on a questionnaire. The proportions of African and European ancestry were estimated as described previously (Shriner et al., 2011). We extracted a subset of 1014 unrelated individuals.
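The derived measures and the case definition above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the study's code: the function names and the example readings are hypothetical. Passing `sbp_cut=130` and `dbp_cut=80` applies the 2017 thresholds discussed in the Results; keeping the medication clause under those thresholds is an assumption made here.

```python
from statistics import mean


def reported_bp(readings):
    """Average the second and third of three seated readings (first is dropped)."""
    if len(readings) != 3:
        raise ValueError("expected exactly three readings")
    return mean(readings[1:])


def is_hypertensive(sbp_readings, dbp_readings, on_medication,
                    sbp_cut=140.0, dbp_cut=90.0):
    """Case definition: SBP >= cut, DBP >= cut, or antihypertensive treatment."""
    sbp = reported_bp(sbp_readings)
    dbp = reported_bp(dbp_readings)
    return sbp >= sbp_cut or dbp >= dbp_cut or on_medication


def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2 from weight in kg and height in cm."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


# Hypothetical participant: reported SBP 137, DBP 83, untreated.
print(is_hypertensive([150, 138, 136], [80, 82, 84], False))
print(is_hypertensive([150, 138, 136], [80, 82, 84], False, 130.0, 80.0))
```

The same individual can be a control under one threshold and a case under another, which is why the thresholds are parameters rather than constants.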
Bayesian logistic regression and time-to-event analysis
Let $T$ represent the time of an event and $S\left(t\right)=\mathrm{Pr}\left(T>t\right)$ represent the survival function, that is, the probability of being event-free as a function of observation time $t$. With cross-sectional data, there is a single observation for each individual. If the individual has not yet experienced the event, then the event time is right censored, because the event is presumed to occur some unknown time after observation. If the individual has already experienced the event, then the event time is left censored, because the event occurred at some unknown time prior to observation. The combination of left and right censored data is known as interval censored data. We performed interval censored proportional hazards analysis using the R package icenReg.
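To make the censoring scheme concrete, the following Python sketch evaluates the log-likelihood it implies: an individual who is already hypertensive at the observed age contributes $1-S\left(t\right)$ (left censored), and one who is not contributes $S\left(t\right)$ (right censored). The Weibull survival function, its parameter values, and the four example observations are illustrative assumptions; the sketch does not reproduce icenReg.

```python
import math


def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale) ** shape); an illustrative parametric choice."""
    return math.exp(-((t / scale) ** shape))


def current_status_loglik(ages, statuses, survival):
    """Log-likelihood when each person is observed once, at a positive age.

    statuses[i] is True if person i already has the event (left censored)
    and False otherwise (right censored).
    """
    total = 0.0
    for age, has_event in zip(ages, statuses):
        s = survival(age)
        total += math.log(1.0 - s) if has_event else math.log(s)
    return total


# Hypothetical data: two event-free and two hypertensive individuals.
ages = [30, 45, 60, 75]
statuses = [False, False, True, True]
for scale in (40.0, 60.0, 80.0):
    ll = current_status_loglik(ages, statuses,
                               lambda t: weibull_survival(t, 2.0, scale))
    print(scale, round(ll, 3))
```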
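To perform Bayesian modeling, we used WinBUGS, version 1.4 with the R package R2WinBUGS. For the i^{th} individual, with ${y}_{i}$ indicating hypertension status and ${p}_{i}$ the probability of hypertension, we modeled logistic regression as:
$${y}_{i}\sim \text{Bernoulli}\left({p}_{i}\right),\phantom{\rule{1em}{0ex}}\text{logit}\left({p}_{i}\right)=\alpha $$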
In this reduced model, the prior distribution for the intercept $\alpha $ follows a diffuse normal distribution with mean 0 and variance 10^{6}. We ran three chains of 10,000 iterations, with a burn-in of 2500 iterations and thinning of 10, yielding a posterior sample based on 2250 iterations. We assessed convergence using the potential scale reduction factor Rhat, which should equal 1 at convergence for all monitored parameters. We assessed model fit using the deviance information criterion (DIC). We then added age to the reduced model and ran three chains of 10,000 iterations, with a burn-in of 2500 iterations and thinning of 10, yielding a posterior sample of 2250 iterations. Next, we added age and age^{2} to the reduced model and ran three chains of 100,000 iterations, with a burn-in of 10,000 and thinning of 50, yielding a posterior sample of 5400 iterations. In all instances, effect sizes followed a diffuse normal prior distribution with mean 0 and variance 10^{6}.
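The posterior sample sizes quoted above follow from the chain settings by simple arithmetic: the number of chains times the post-burn-in iterations divided by the thinning interval. A minimal Python check (the function name is hypothetical):

```python
def posterior_sample_size(chains, iterations, burn_in, thin):
    """Retained draws across all chains after burn-in and thinning."""
    return chains * ((iterations - burn_in) // thin)


# Reduced model and model with age: 3 chains, 10,000 iterations.
print(posterior_sample_size(3, 10_000, 2_500, 10))
# Model with age and age^2: 3 chains, 100,000 iterations.
print(posterior_sample_size(3, 100_000, 10_000, 50))
```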
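We performed time-to-event analysis using a proportional hazards model (Congdon, 2003). The hazard function $h\left(t\right)$ defines the instantaneous risk of the event at time $t$, conditional on being event-free at that time:
$$h\left(t\right)=\underset{\Delta t\to 0}{\mathrm{lim}}\frac{\mathrm{Pr}\left(t\le T<t+\Delta t\mid T\ge t\right)}{\Delta t}$$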
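Given the hazard function $h\left(t\right)$, the cumulative hazard function $H\left(t\right)$ and the survival function $S\left(t\right)$ are defined as follows:
$$H\left(t\right)={\int }_{0}^{t}h\left(u\right)du,\phantom{\rule{1em}{0ex}}S\left(t\right)=\mathrm{exp}\left(-H\left(t\right)\right)$$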
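As a sketch of how these quantities relate in practice, the Python below accumulates hazards into a survival curve and reads off the median event time. It assumes the hazard is constant within one-year intervals, so the cumulative hazard is a running sum. The hazard values are made up for illustration and are not estimates from this study.

```python
import math
from itertools import accumulate


def survival_curve(yearly_hazards):
    """S at the end of each one-year interval, given piecewise-constant hazards."""
    cumulative = accumulate(yearly_hazards)  # H(1), H(2), ...
    return [math.exp(-h) for h in cumulative]


def median_event_time(yearly_hazards):
    """First year at which S drops to 0.5 or below, or None if it never does."""
    for year, s in enumerate(survival_curve(yearly_hazards), start=1):
        if s <= 0.5:
            return year
    return None


# Illustrative constant hazard of 0.1 per year over 20 years.
print(median_event_time([0.1] * 20))
```

Returning `None` when survival never reaches 0.5 mirrors the situation in which the median is not reached within the observed age range.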
In Cox’s proportional hazards model (Cox, 1972), the hazard function $\lambda \left(t\mid z\right)$ is given by ${\lambda}_{0}\left(t\right){e}^{\beta z}$, in which ${\lambda}_{0}\left(t\right)$ is the baseline hazard function and $z$ is a covariate with coefficient $\beta $. In the absence of covariates, the hazard function is equivalent to the baseline hazard function, $\lambda \left(t\right)={\lambda}_{0}\left(t\right)$. We modeled the hazard rates as gamma-distributed with a correlated prior process (Arjas and Gasbarra, 1994; Congdon, 2006). Specifically, the hazard rates were:
In this model, the expected value of ${\lambda}_{0}\left(t+1\right)$ equals ${\lambda}_{0}\left(t\right)$ and hence the hazard function is a martingale. We divided time into intervals of one year and assumed that the baseline hazard was constant within intervals.
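The martingale property can be illustrated with a short simulation. One parameterization that satisfies it, assumed here for illustration and not necessarily the one used in the study, draws ${\lambda}_{0}\left(t+1\right)$ from a gamma distribution with shape $c{\lambda}_{0}\left(t\right)$ and rate $c$, so that its mean is ${\lambda}_{0}\left(t\right)$. The constant `c`, the starting hazard, and the function name are hypothetical.

```python
import random


def next_hazard(current, c, rng):
    """Draw the next interval's hazard with mean equal to the current hazard.

    Shape = c * current and rate = c give mean = shape / rate = current.
    random.gammavariate takes (shape, scale), so scale = 1 / c.
    """
    return rng.gammavariate(c * current, 1.0 / c)


rng = random.Random(0)          # fixed seed so the sketch is reproducible
current = 0.02                  # hypothetical hazard in year t
draws = [next_hazard(current, 1000.0, rng) for _ in range(20000)]
print(sum(draws) / len(draws))  # close to 0.02 under this parameterization
```

Larger values of `c` tie consecutive hazards more tightly together, which is the smoothing role a correlated prior process plays.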
A major assumption of time-to-event analysis is that the event will occur for every individual at some point in time, although the individual may die before the event would have occurred. This assumption can be relaxed by incorporating a permanent stayer or cured fraction, that is, a fraction of individuals who will never experience the event. In our context, this fraction represents individuals who never develop hypertension and hence are true epidemiological controls. Let $\pi $ represent the permanent stayer fraction. Then, the survival function for the entire population ${S}_{p}\left(t\right)$ is the two-component mixture model ${S}_{p}\left(t\right)=\pi +\left(1-\pi \right)S\left(t\right)$ (Gu et al., 2011). We assigned to $\pi $ the prior distribution $\text{Uniform}\left(0,1\right)$.
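The mixture formula is easy to explore numerically. In the hedged Python sketch below (the function name and input values are illustrative), population survival can never fall below the stayer fraction, so a curve that plateaus above zero is what a nonzero $\pi $ would look like.

```python
def population_survival(s_t, stayer_fraction):
    """S_p(t) = pi + (1 - pi) * S(t) for a permanent stayer fraction pi."""
    if not 0.0 <= stayer_fraction <= 1.0:
        raise ValueError("stayer fraction must lie in [0, 1]")
    return stayer_fraction + (1.0 - stayer_fraction) * s_t


# As S(t) falls toward 0, population survival levels off at pi.
for s_t in (1.0, 0.5, 0.1, 0.0):
    print(s_t, population_survival(s_t, 0.1))
```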
Model selection
We tested 38 covariates for inclusion in the model: sex; weight, height, hip circumference, waist circumference, waist-hip ratio, body mass index; fat mass, fat-free mass, percent fat mass; type 2 diabetes status, fasting glucose, fasting insulin, fructosamine; triglycerides, high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, total cholesterol; creatinine, creatinine clearance, estimated glomerular filtration rate; alkaline phosphatase, alanine aminotransferase, total bilirubin; last grade of education completed, income; percent African ancestry; calcium, chloride, potassium, sodium; albumin, carbon dioxide, C-reactive protein, total protein, uric acid, urea, and cortisol. We used a four-step forward-backward regression procedure to perform model selection. First, we fit a single regression model for each covariate. Second, we fit a multiple regression model with all significant predictors from step 1 and used backward selection to remove nonsignificant predictors. Third, starting with the final model from step 2, we reconsidered each nonsignificant covariate from step 1 using forward selection. Fourth, we performed a final pruning step on the final model from step 3. For every test, we declared a significance level of 0.05. Pseudo-r^{2} values were estimated using the formula ${r}^{2}=\frac{1-\mathrm{exp}\left(\frac{{D}_{1}-{D}_{0}}{n}\right)}{1-\mathrm{exp}\left(-\frac{{D}_{0}}{n}\right)}$, in which ${D}_{1}$ is the deviance, ${D}_{0}$ is the null deviance, and $n$ is the sample size (Nagelkerke, 1991). Finally, we added age, age^{2}, and selected covariates to the Bayesian logistic regression model described above and ran three chains of 1,000,000 iterations, with a burn-in of 100,000 iterations and thinning of 1,000, yielding a posterior sample of 2700 iterations. Effect sizes for all covariates followed a diffuse normal prior distribution with mean 0 and variance 10^{6}.
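The pseudo-r^{2} formula translates directly into code. The Python sketch below is illustrative only. The function name is hypothetical, and the deviance values are invented for the example; only the sample size of 1014 comes from the text.

```python
import math


def nagelkerke_r2(deviance, null_deviance, n):
    """Nagelkerke pseudo-r^2 from the model deviance D1 and null deviance D0."""
    numerator = 1.0 - math.exp((deviance - null_deviance) / n)
    denominator = 1.0 - math.exp(-null_deviance / n)  # largest attainable value
    return numerator / denominator


# Illustrative deviances for a sample of 1014 individuals.
print(round(nagelkerke_r2(1100.0, 1400.0, 1014), 3))
```

Dividing by the largest attainable value rescales the statistic so that a model with zero deviance scores exactly 1, and a model no better than the null scores 0.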
Replication study
The National Center for Health Statistics of the US Centers for Disease Control and Prevention conducts the ongoing National Health and Nutrition Examination Survey (NHANES). The survey comprises an in-home interview and a clinical examination by a mobile examination center. We retrieved 16 years of examination data (from 1999 to 2014) from the CDC portal (http://wwwn.cdc.gov/Nchs/Nhanes). We downloaded the variables BMXWT, BPQ020, BPQ040A, BPXDI1, BPXDI2, BPXDI3, BPXDI4, BPXSY1, BPXSY2, BPXSY3, BPXSY4, LBDLDL, LBXIN, LBXSCLSI, LBXSKSI, LBXSUA, RIDAGEEX, RIAGENDR, RIDRETH1, SDMVPSU, SDMVSTRA, and WTSAF2YR. SBP was defined as the average of BPXSY1, BPXSY2, BPXSY3, and BPXSY4. DBP was defined as the average of BPXDI1, BPXDI2, BPXDI3, and BPXDI4. Hypertension was defined as SBP ≥ 140 mm Hg, DBP ≥ 90 mm Hg, treatment with antihypertensive medication, or having ever been diagnosed by a doctor. Strata, clusters, and weights were designed to make statistical estimates representative of the noninstitutionalized, civilian US population. We included the strata variable SDMVSTRA as a factor with 118 levels. The factor SDMVPSU defined clusters nested within strata. There were up to three levels of cluster within each stratum, yielding a total of 241 combinations of stratum and cluster. Individuals with fasting samples represented less than half of the individuals assessed by interview or the mobile examination center, such that some clusters and some strata were empty or sparse. Consequently, we omitted the cluster factor. To account for eight survey cycles, we multiplied the weights WTSAF2YR equally by 1/8. For each value of RIDRETH1, we rescaled the weights by dividing by the mean.
For the survey cycle 2013–2014, we recalibrated insulin to account for changes in the protocol: ${\text{Insulin}}_{2011\text{–}2012}={10}^{\left(0.9765\times {\mathrm{log}}_{10}\left({\text{Insulin}}_{2013\text{–}2014}+0.07832\right)\right)}$ (https://wwwn.cdc.gov/Nchs/Nhanes/2013-2014/INS_H.htm). Across the eight survey cycles, we retrieved data for a total of 23,628 participants, including 5146 African Americans (‘non-Hispanic Blacks’), 10,023 European Americans (‘non-Hispanic Whites’), and 5059 Mexican Americans.
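The weight handling described above (pooling eight cycles, then rescaling within each ethnicity group so that the mean weight is 1) can be sketched as follows. The function name, the group labels, and the example weights are hypothetical; real input would be the WTSAF2YR and RIDRETH1 columns.

```python
from collections import defaultdict


def rescale_weights(weights, groups, n_cycles=8):
    """Divide two-year weights by the number of cycles, then by the group mean."""
    pooled = [w / n_cycles for w in weights]
    totals = defaultdict(float)
    counts = defaultdict(int)
    for w, g in zip(pooled, groups):
        totals[g] += w
        counts[g] += 1
    return [w / (totals[g] / counts[g]) for w, g in zip(pooled, groups)]


# Hypothetical weights for two groups labelled "A" and "B".
print(rescale_weights([8.0, 24.0, 16.0, 16.0], ["A", "A", "B", "B"]))
```

After rescaling, the weights within each group average to 1, so they preserve relative representation without inflating the apparent sample size.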
Code availability
WinBUGS code is available at https://github.com/dshriner/Timetoevent (Shriner, 2020).
Data availability
Raw source data files have been de-identified and made available for both discovery and replication data sets.
References

Nonparametric Bayesian inference from right censored survival data, using the Gibbs sampler. Statistica Sinica 4:505–524.

Offspring of centenarians have a favorable lipid profile. Journal of the American Geriatrics Society 49:76–79. https://doi.org/10.1046/j.15325415.2001.49013.x

Report. Hypertension Cascade: Hypertension Prevalence, Treatment and Control Estimates Among US Adults Aged 18 Years and Older Applying the Criteria From the American College of Cardiology and American Heart Association's 2017 Hypertension Guideline, NHANES 2013–2016. Centers for Disease Control and Prevention.

Regression models and life-tables. Journal of the Royal Statistical Society: Series B 34:187–202.

Incidence of hypertension in the Framingham Study. American Journal of Public Health 78:676–679. https://doi.org/10.2105/AJPH.78.6.676

Differing birth weight among infants of U.S.-born blacks, African-born blacks, and U.S.-born whites. New England Journal of Medicine 337:1209–1214. https://doi.org/10.1056/NEJM199710233371706

Analysis of cure rate survival data under proportional odds model. Lifetime Data Analysis 17:123–134. https://doi.org/10.1007/s109850109171z

Uric acid and cardiovascular events: a Mendelian randomization study. Journal of the American Society of Nephrology 26:2831–2838. https://doi.org/10.1681/ASN.2014070660

Dyslipidaemia as a predictor of hypertension in middle-aged men. European Heart Journal 29:2561–2568. https://doi.org/10.1093/eurheartj/ehn061

Low birth weight as a risk factor for hypertension. The Journal of Clinical Hypertension 5:133–136. https://doi.org/10.1111/j.15246175.2003.01353.x

Relation between body mass index and body fat in black population samples from Nigeria, Jamaica, and the United States. American Journal of Epidemiology 145:620–628. https://doi.org/10.1093/oxfordjournals.aje.a009159

K/DOQI clinical practice guidelines for chronic kidney disease: evaluation, classification, and stratification. American Journal of Kidney Diseases 39:S1–S266.

Dyslipidemia and the risk of developing hypertension in a working-age male population. Journal of the American Heart Association 5:e003053. https://doi.org/10.1161/JAHA.115.003053

Genotype-based changes in serum uric acid affect blood pressure. Kidney International 81:502–507. https://doi.org/10.1038/ki.2011.414

LDL cholesterol still a problem in old age? A Mendelian randomization study. International Journal of Epidemiology 44:604–612. https://doi.org/10.1093/ije/dyv031

Clinical and laboratory characteristics of active and healthy aging (AHA) in octogenarian men. Aging Clinical and Experimental Research 27:581–587. https://doi.org/10.1007/s4052001503290

Mendelian randomization provides no evidence for a causal role of serum urate in increasing serum triglyceride levels. Circulation: Cardiovascular Genetics 7:830–837. https://doi.org/10.1161/CIRCGENETICS.114.000556

A prospective study of plasma lipid levels and hypertension in women. Archives of Internal Medicine 165:2420–2427. https://doi.org/10.1001/archinte.165.20.2420

Joint ancestry and association testing in admixed individuals. PLOS Computational Biology 7:e1002325. https://doi.org/10.1371/journal.pcbi.1002325

2017 ACC/AHA/AAPA/ABC/ACPM/AGS/APhA/ASH/ASPC/NMA/PCNA guideline for the prevention, detection, evaluation, and management of high blood pressure in adults: a report of the American College of Cardiology/American Heart Association task force on clinical practice guidelines. Hypertension 71:e13–e115. https://doi.org/10.1161/HYP.0000000000000065

Lipoprotein levels are associated with incident hypertension in older adults. Journal of the American Geriatrics Society 52:916–921. https://doi.org/10.1111/j.15325415.2004.52258.x
Decision letter

Mone Zaidi, Reviewing Editor; Icahn School of Medicine at Mount Sinai, United States

Matthias Barton, Senior Editor; University of Zurich, Switzerland

Zané Lombard, Reviewer; National Health Laboratory Service & University of the Witwatersrand, South Africa

Bert-Jan van den Born, Reviewer
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Acceptance summary:
You have compared Bayesian logistic and time-to-event approaches to modeling hypertension in African Americans, European Americans and Mexican Americans. Your modeling of the baseline hazard function of hypertension suggests that there are no true controls and that controls in logistic regression are cases with a late age of onset. The studies are significant to our understanding of the efficacy of antihypertensive agents, and in particular, when comparator studies with established agents are used to approve new compounds.
Decision letter after peer review:
Thank you for submitting your article "Time-to-Event Modeling of Hypertension Reveals the Nonexistence of True Controls" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Zané Lombard (Reviewer #1); Bert-Jan van den Born (Reviewer #2).
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Summary:
The authors describe a statistical modelling approach to determine whether true controls for hypertension (i.e. individuals who will not develop hypertension over their lifetime) exist. This is an important question, as control participants are a key component in study design for epidemiological and genetic studies, particularly those that follow a chronic course. The authors tested their model in the Howard University Family Study (HUFS) and replicated their findings in the National Health and Nutrition Examination Survey (NHANES). The authors conclude that there are no true (normotensive) controls as previously normotensive individuals will become hypertensive later in life. The manuscript is based on high quality analyses and is well written.
Essential revisions:
The following points need careful consideration and resolution.
1) The realization that many normotensive controls will later become hypertensive is not surprising given the known sharp increase of hypertension prevalence with age from both cross-sectional analyses and prospective follow-up data. From population studies, we already know that >50% of the population is hypertensive above age 50 years, and that above 80 years of age, prevalence rates are ~80%. Except for the interesting modeling approach, the question therefore remains what this study really adds. Please comment.
2) The authors have assessed 38 potential covariates, with 6 remaining in the final model including 3 variables that had a negative association with lifetime hypertension risk, including LDL cholesterol. The finding that increasing LDL cholesterol was associated with a decreased risk of hypertension (not known in existing literature) raises the question whether competing risks for mortality (individuals with higher LDL cholesterol die earlier and therefore have a lower lifetime risk of hypertension) were sufficiently taken into account.
3) The authors do not explain the counterintuitive finding that lower chloride levels are associated with a higher risk of hypertension.
4) The models with the 6 covariates that remained are reported for European and Hispanic (Mexican) Americans, but not for African Americans. This would be of interest as African Americans are known to have a more salt-sensitive type of hypertension and are genetically more different compared to European and Hispanic Americans (who have the common European background).
https://doi.org/10.7554/eLife.62998.sa1
Author response
Essential revisions:
The following points need careful consideration and resolution.
1) The realization that many normotensive controls will later become hypertensive is not surprising given the known sharp increase of hypertension prevalence with age from both cross-sectional analyses and prospective follow-up data. From population studies, we already know that >50% of the population is hypertensive above age 50 years, and that above 80 years of age, prevalence rates are ~80%. Except for the interesting modeling approach, the question therefore remains what this study really adds. Please comment.
While we agree that it is known that the prevalence of hypertension increases with age, our first important contribution is to rigorously quantify the underlying baseline hazard function (Discussion, first paragraph). Besides successfully recapitulating known aspects of hypertension prevalence, risk, and disparities (see the aforementioned paragraph), explicit modeling of the baseline hazard function allowed for formal comparison between time-to-event analysis and logistic regression (Discussion, third paragraph). We found that the instantaneous hazard rate of hypertension reaches a maximum at around 60 years of age in all studied ethnic groups. Even though the instantaneous hazard rate tends to decrease after 60 years of age, “controls” in logistic regression should be interpreted as individuals who have not yet become hypertensive, rather than as individuals who will not become hypertensive, which impacts what has been termed healthy aging (Discussion, fourth paragraph). Also, we defined important covariates to include, how much variance they explain, and how the variance explained differs among ethnic groups (Discussion, fifth, sixth and seventh paragraphs). As noted in Comments 2 and 3 below, our findings specifically regarding low-density lipoprotein cholesterol and chloride may have been unexpected, although there is published evidence supporting our findings for both of these covariates. These contributions are summarized in the final paragraph of the Discussion.
2) The authors have assessed 38 potential covariates, with 6 remaining in the final model including 3 variables that had a negative association with lifetime hypertension risk, including LDL cholesterol. The finding that increasing LDL cholesterol was associated with a decreased risk of hypertension (not known in existing literature) raises the question whether competing risks for mortality (individuals with higher LDL cholesterol die earlier and therefore have a lower lifetime risk of hypertension) were sufficiently taken into account.
We agree that it is possible that the observed association between LDL cholesterol and hypertension might be affected by survival bias (Discussion, second paragraph). Studies of the relationship between LDL cholesterol and healthy aging have led to conflicting conclusions, with some studies reporting a negative association and other studies reporting a positive association (Discussion, sixth paragraph). In our study, both data sets include a single observation per individual; we know with certainty that each observed individual was alive at the time of observation, but we have no information regarding time or cause of mortality. Therefore, we lack data with which to model either all-cause or cause-specific mortality as a competing risk. We acknowledge this limitation in the aforementioned paragraph.
3) The authors do not explain the counterintuitive finding that lower chloride levels are associated with a higher risk of hypertension.
Although there is evidence that higher dietary chloride intake is a risk factor for hypertension, lower serum chloride levels have been found to be associated with higher risks of cardiovascular disease and hypertension and higher all-cause mortality among hypertensives (Discussion, fifth paragraph). Thus, our study is not the first to report this observation.
4) The models with the 6 covariates that remained are reported for European and Hispanic (Mexican) Americans, but not for African Americans. This would be of interest as African Americans are known to have a more salt-sensitive type of hypertension and are genetically more different compared to European and Hispanic Americans (who have the common European background).
The models with the six selected covariates are reported for African Americans in Table 3 and for European and Mexican Americans in Table 4. Results for all six covariates are also included in Tables 5 and 6.
https://doi.org/10.7554/eLife.62998.sa2
Article and author information
Author details
Funding
National Human Genome Research Institute (1ZIAHG200362)
 Charles N Rotimi
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official view of the National Institutes of Health. This research was supported by the Intramural Research Program of the Center for Research on Genomics and Global Health (CRGGH). The CRGGH is supported by the National Human Genome Research Institute, the National Institute of Diabetes and Digestive and Kidney Diseases, the Center for Information Technology, and the Office of the Director at the National Institutes of Health (1ZIAHG200362).
Ethics
Human subjects: Ethics approval for the Howard University Family Study (HUFS) was obtained from the Howard University Institutional Review Board (protocol number IRB06GSAS32A) and written informed consent was obtained from each participant. All clinical investigation was conducted according to the principles expressed in the Declaration of Helsinki.
Senior Editor
 Matthias Barton, University of Zurich, Switzerland
Reviewing Editor
 Mone Zaidi, Icahn School of Medicine at Mount Sinai, United States
Reviewers
 Zané Lombard, National Health Laboratory Service & University of the Witwatersrand, South Africa
 Bert-Jan van den Born
Publication history
 Received: September 10, 2020
 Accepted: November 13, 2020
 Version of Record published: December 1, 2020 (version 1)
Copyright
This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.