Disentangling the relationship between cancer mortality and COVID-19 in the US

  1. Chelsea L Hansen (corresponding author)
  2. Cécile Viboud
  3. Lone Simonsen
  1. Division of International Epidemiology and Population Studies, Fogarty International Center, National Institutes of Health, United States
  2. PandemiX Center, Dept of Science & Environment, Roskilde University, Denmark
  3. Brotman Baty Institute, University of Washington, United States

eLife assessment

This valuable work explores death coding data to understand the impact of COVID-19 on cancer mortality. The work provides solid evidence that deaths with cancer as a contributing cause were not above what would be expected during pandemic waves, suggesting that cancer did not strongly increase the risk of dying of COVID-19. These results are an interesting exploration into the coding of causes of death that can be used to make sense of how deaths are coded during a pandemic in the presence of other underlying diseases, such as cancer.

https://doi.org/10.7554/eLife.93758.3.sa0

Abstract

Cancer is considered a risk factor for COVID-19 mortality, yet several countries have reported that deaths with a primary code of cancer remained within historic levels during the COVID-19 pandemic. Here, we further elucidate the relationship between cancer mortality and COVID-19 on a population level in the US. We compared pandemic-related mortality patterns from underlying and multiple cause (MC) death data for six types of cancer, diabetes, and Alzheimer’s. Any pandemic-related changes in coding practices should be eliminated by study of MC data. Nationally in 2020, MC cancer mortality rose by only 3% over a pre-pandemic baseline, corresponding to ~13,600 excess deaths. Mortality elevation was measurably higher for less deadly cancers (breast, colorectal, and hematological, 2–7%) than cancers with a poor survival rate (lung and pancreatic, 0–1%). In comparison, there was substantial elevation in MC deaths from diabetes (37%) and Alzheimer’s (19%). To understand these differences, we simulated the expected excess mortality for each condition using COVID-19 attack rates, life expectancy, population size, and mean age of individuals living with each condition. We find that the observed mortality differences are primarily explained by differences in life expectancy, with the risk of death from deadly cancers outcompeting the risk of death from COVID-19.

eLife digest

Establishing the true death toll of a pandemic like COVID-19 is difficult, as laboratory testing is generally too limited to directly count the number of deaths that can be attributed to a particular pathogen. To overcome this, researchers analyse excess mortality – that is, they compare the observed number of deaths with the expected level based on trends in prior years. These techniques have been used for over 100 years to estimate the burden of pandemic influenza and became a popular way to estimate deaths due to the COVID-19 pandemic.

Excess mortality can also reveal the impact of COVID-19 on sub-populations with chronic conditions. For example, previous studies showed that deaths with diabetes, heart disease and Alzheimer’s disease listed as the primary cause of death increased during waves of COVID-19. Cancer deaths did not show such a pattern, however, despite some epidemiological studies identifying cancer as a risk factor for COVID-19 mortality.

To understand why this may be the case, Hansen et al. reviewed death certificates from different states in the United States during the first year of the pandemic. Their analyses of multiple-cause death records (listing cancer anywhere on the death certificate, not just as the primary cause of death) showed that death certificate coding practices during the pandemic did not explain the absence of excess cancer mortality. While a low level of excess mortality was detectable for cancers with longer life expectancy (breast cancer, for example), no elevation was observed for cancers with lower life expectancy, such as pancreatic cancer. The analyses demonstrate that the lack of excess mortality for especially deadly cancers can be explained through competing risks – in other words, the high risk of dying from the cancer itself vastly outweighs the additional risk posed by COVID-19.

These findings shed light on how competing mortality risks might mask the true impact of COVID-19 on cancer mortality and explain the apparent discrepancy between cohort studies and excess mortality studies. To fully comprehend the impact of COVID-19 on patients living with cancers, future research should look at the possibility of longer-term increases in cancer mortality due to late diagnosis during pandemic lockdowns, and an elevated risk of severe illness.

Introduction

The dominant risk factors for COVID-19 mortality have consistently been shown to be advanced age, male gender, and certain chronic diseases such as diabetes, obesity, and heart disease (Chavez-MacGregor et al., 2022; Rüthrich et al., 2021; Williamson et al., 2020). Cancer has also been identified as a high-risk condition based on case-control and cohort studies, although these studies have provided conflicting results. In a large cohort study of ~500,000 COVID-19 inpatients, only cancer patients under recent treatment were at increased risk of COVID-19-related deaths (OR = 1.7) relative to non-cancer patients (Chavez-MacGregor et al., 2022). Conversely, a smaller European study of 3000 COVID-19 inpatients found that cancer was not a risk factor (Rüthrich et al., 2021), as did an international, multicenter study of 4000 confirmed COVID-19 inpatients (Raad et al., 2023). More recently a meta-analysis of 35 studies from Europe, North America, and Asia found a twofold increased risk of COVID-19 mortality among cancer patients (Di Felice et al., 2022). Similarly, a large analysis from the UK found that the risk of COVID-19 mortality for cancer patients had declined over the course of the pandemic but remained 2.5 times higher than for non-cancer patients into 2022 (Starkey et al., 2023). Taken together, such observational studies provide a mixed picture of cancer as a COVID-19 mortality risk factor, with several studies reporting that controlling for important factors such as age is a challenge. Furthermore, cancer is often considered as a single disease category despite the diversity of conditions and patients represented.

Further evidence for the relationship between cancer and COVID-19 comes from population-level analysis of vital statistics. A recent US study showed no elevation in underlying cancer deaths concomitant with COVID-19 waves, in stark contrast to the sharp rise in mortality from other chronic diseases (Lee et al., 2023a). In several other countries, including Sweden, Italy, Latvia, Brazil, England, and Wales, underlying cancer mortality was found to be stable or decreasing during the first year of the pandemic (Alicandro et al., 2023; Fernandes et al., 2021; Gobiņa et al., 2022; Grande et al., 2022; Kontopantelis et al., 2022; Lundberg et al., 2023). Further, an excess mortality study of 240,000 cancer patients in Belgium found a 33% rise in mortality in April 2020, but concluded that this was no different from the rise observed in the general population (Silversmit et al., 2021). The apparent lack of association between cancer mortality and COVID-19 at the population level raises the question of the true relationship between cancer and COVID-19.

The relationship between these two diseases could occur via multiple biological mechanisms. First, immunosuppression in cancer patients could increase susceptibility to SARS-CoV-2 infection and/or risk of severe clinical outcome upon infection. Conversely, immunosuppression could act as a protective factor in the face of a severe respiratory infection that kills by over-stimulating the immune system – the immune incompetence rescue hypothesis (Reichert et al., 2004). This hypothesis was put forward to explain the observed absence in excess cancer mortality during the 1968 influenza pandemic, a departure from elevated mortality seen for other high-risk conditions such as heart disease and diabetes (Reichert et al., 2004). A further mechanism that could affect the observed relationship between cancer deaths and COVID-19 is changing guidelines for establishing the primary cause of death. Coding guidelines evolved throughout the pandemic as testing for SARS-CoV-2 infection became more widespread, which presumably affected vital statistics studies.

To further elucidate the relationship between cancer mortality and COVID-19 on a population level, we analyzed US vital statistics in detail to understand the potential role of death certificate coding changes during the pandemic and explored putative differences in mortality patterns between different types of cancer. We considered death certificates listing cancer as the underlying cause (UC) of death or anywhere on the death certificate (multiple cause [MC]). Assuming there is a high propensity to attribute a primary code of COVID-19 during the pandemic in any patient with COVID-19, deaths among individuals with both cancer and COVID-19 near the time of death would be coded as UC COVID-19. However, cancer should still be captured in the MC data, and thus, analysis of MC death data should control for any changes in death certificate coding practices during the pandemic (Fedeli et al., 2024). The US provides a particularly useful case study as the timing of COVID-19 waves varied considerably between states, so that elevations in cancer deaths, should they exist, should also be heterogeneous. For comparison, we also assessed population-level excess mortality patterns for other chronic conditions such as diabetes, ischemic heart disease (IHD), kidney disease, and Alzheimer’s, for which the association with COVID-19 is less debated.

Results

Establishing patterns and timing of COVID-19-related deaths

We obtained individual ICD-10-coded death certificate data from the US for the period January 1, 2014, to December 31, 2020. We compiled time series by week and cause of death, for UC and for MC mortality. We considered 10 causes of death, including diabetes, Alzheimer’s disease, IHD, kidney disease, and six types of cancer (all-cause cancer, colorectal, breast, pancreatic, lung, and hematological; see Table 1 and Appendix 1—table 1 for a list of disease codes). We chose these specific cancers to illustrate conditions for which the 5-year survival rate is low (13% and 25%, respectively, for pancreatic and lung cancers) and high (65% and 91%, respectively, for colorectal and breast cancers) (National Cancer Institute, 2024). Hematological cancer (67% 5-year survival rate) was included because it has been singled out as a risk factor in several previous studies due to the immune suppression associated with both its malignancy and treatment (Chavez-MacGregor et al., 2022; Han et al., 2022a; Rüthrich et al., 2021; Williamson et al., 2020). To compare mortality patterns with the timing of COVID-19 pandemic waves, we accessed national- and state-level counts of reported COVID-19 cases from the Centers for Disease Control and Prevention, 2022.

Table 1
Each diagnosis group and its corresponding ICD-10 codes, number of underlying deaths, mean age in years at time of death, the percentage of deaths occurring at home, and the percentage of deaths occurring in nursing homes for 2019 and 2020.
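| Year | Diagnosis group | ICD-10 codes | No. deaths | Mean age, years (IQR) | % Home/ER | % Nursing home |
|---|---|---|---|---|---|---|
| 2019 | Cancer | C00-C99 | 493,397 | 72 (64–81) | 45 | 12 |
| 2019 | Pancreatic cancer | C25 | 37,864 | 72 (64–80) | 51 | 9 |
| 2019 | Colorectal cancer | C18-C20 | 42,484 | 71 (61–82) | 46 | 13 |
| 2019 | Hematological cancers | C81-C96 | 47,174 | 74 (67–84) | 35 | 11 |
| 2019 | Diabetes | E10-E14 | 70,763 | 72 (63–82) | 53 | 17 |
| 2019 | Alzheimer’s | G30 | 98,675 | 87 (82–92) | 29 | 50 |
| 2020 | Cancer | C00-C99 | 513,275 | 72 (64–81) | 55 | 8 |
| 2020 | Pancreatic cancer | C25 | 39,893 | 72 (65–80) | 61 | 6 |
| 2020 | Colorectal cancer | C18-C20 | 43,990 | 71 (61–82) | 56 | 9 |
| 2020 | Hematological cancers | C81-C96 | 49,161 | 74 (67–84) | 46 | 8 |
| 2020 | Diabetes | E10-E14 | 88,124 | 71 (62–82) | 58 | 15 |
| 2020 | Alzheimer’s | G30 | 115,256 | 86 (82–92) | 33 | 46 |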
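
The compilation of weekly UC and MC time series from individual death certificates can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout (`date`, `underlying`, `contributing`), the function names, and the use of ISO weeks are all assumptions made for the example; only the ICD-10 code ranges come from Table 1.

```python
from collections import Counter
from datetime import date

# ICD-10 three-character ranges per diagnosis group (from Table 1).
GROUPS = {
    "cancer": ("C00", "C99"),
    "pancreatic": ("C25", "C25"),
    "colorectal": ("C18", "C20"),
    "hematological": ("C81", "C96"),
    "diabetes": ("E10", "E14"),
    "alzheimers": ("G30", "G30"),
}


def in_group(code, group):
    """True if the first three characters of an ICD-10 code fall in the group's range."""
    low, high = GROUPS[group]
    return low <= code[:3].upper() <= high


def weekly_counts(certificates, group):
    """Count deaths per (ISO year, ISO week) for one diagnosis group.

    Returns two Counters: underlying-cause (UC) deaths, and multiple-cause (MC)
    deaths, where MC means the group appears anywhere on the certificate.
    The record fields used here are hypothetical.
    """
    uc, mc = Counter(), Counter()
    for cert in certificates:
        iso = cert["date"].isocalendar()
        week = (iso[0], iso[1])
        all_codes = [cert["underlying"], *cert["contributing"]]
        if in_group(cert["underlying"], group):
            uc[week] += 1
        if any(in_group(code, group) for code in all_codes):
            mc[week] += 1
    return uc, mc
```

Under this scheme, a death coded with COVID-19 as the underlying cause and pancreatic cancer as a contributing cause is absent from the UC pancreatic series but still counted in the MC series, which is why MC data are robust to shifts in underlying-cause coding.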

In national data, time series of COVID-19-coded death certificates (both UC and MC) tracked with the temporal patterns of laboratory-confirmed COVID-19 cases (Figure 1), revealing three distinct COVID-19 waves: a spring wave peaking on April 12, 2020, a smaller summer wave peaking on July 26, 2020, and a large winter wave that had not yet peaked by the end of the study in December 2020. This correspondence between COVID-19 case and death activity represents a ‘signature’ mortality pattern of COVID-19.

Figure 1
Weekly counts of death certificates listing COVID-19 as either the underlying or a multiple cause.

When included on a death certificate, COVID-19 was most often listed as the underlying cause of death rather than a contributing cause. National-level data reveal three distinct waves: Wave 1 (spring, March 1 to June 27, 2020), Wave 2 (summer, June 28 to October 3, 2020), and Wave 3 (winter, October 4 to December 6, 2020, incomplete). Vertical dashed lines represent the peak of each wave; dotted lines represent the number of reported cases (y-axis on the right). New York experienced its first large COVID-19 wave in Wave 1, while Texas had its first large wave in Wave 2, and California did not experience a large wave until Wave 3, which had not yet peaked at the end of 2020.

Analysis of state-level data reveals variable timing, intensity, and number of COVID-19 waves across the US during 2020. To focus on periods with substantial COVID-19 activity and explore the association with cancer, we identified three large US states with unique, well-defined waves (Figure 1). New York (NY) state experienced a large, early wave in March–May 2020, based on recorded COVID-19 cases and deaths and high seroprevalence of SARS-CoV-2 antibodies in this period (over 20%; Stadlbauer et al., 2021). Meanwhile, California (CA) experienced a large COVID-19 wave at the end of the year and had little activity during the spring and summer. Finally, Texas (TX) had two large waves; one during late summer, followed by one in winter 2020.

National patterns in excess mortality from cancer

Similar to other influenza and COVID-19 population-level mortality studies (Islam et al., 2021; Karlinsky and Kobak, 2021; Lee et al., 2023a; Msemburi et al., 2023), we established a weekly baseline model for expected mortality in the absence of pandemic activity by modeling time trends and seasonality in pre-pandemic data and letting the model run forward during the pandemic (see Materials and methods). Each cause of death (UC and MC) and geography (aggregated National, NY, TX, and CA) was modeled separately. We then computed excess mortality as the difference between observed deaths and the model-predicted baseline. We summed weekly estimates to calculate excess mortality for the full pandemic period and during each of the three waves (see Materials and methods). In addition to these absolute effects of the pandemic on mortality, we also calculated the relative effects by dividing excess mortality by baseline mortality. This approach has been used in the past to standardize mortality effects in strata with very different underlying risks (e.g. age groups, geographies, or causes of death, see Materials and methods).
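
The excess mortality arithmetic described above can be sketched in a few lines. This is a minimal sketch that takes the model-predicted baseline as given (the baseline model itself is described in Materials and methods and is not reproduced here); the function name and the example numbers are hypothetical.

```python
def excess_mortality(observed, baseline):
    """Sum weekly excess deaths over a period and express them relative to baseline.

    observed, baseline: sequences of weekly death counts covering the same weeks.
    Returns (absolute excess deaths, percent over baseline).
    """
    if len(observed) != len(baseline):
        raise ValueError("observed and baseline must cover the same weeks")
    # Absolute effect: observed deaths minus model-predicted baseline, summed.
    excess = sum(o - b for o, b in zip(observed, baseline))
    # Relative effect: excess divided by baseline mortality over the same period.
    percent_over_baseline = 100.0 * excess / sum(baseline)
    return excess, percent_over_baseline


# Hypothetical three-week wave: 150 excess deaths on a baseline of 3000.
wave_excess, wave_percent = excess_mortality([1050, 1100, 1000], [1000, 1000, 1000])
```

Dividing by the baseline is what allows a comparison between causes of death with very different absolute counts, such as all-cause cancer and pancreatic cancer.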

Nationally, we found a drop in UC cancer deaths during spring 2020 (Figure 2a; Table 2), although the drop was not statistically significant. A similar non-significant decline was also seen for specific cancer types (Figure 2d–f; Appendix 1—figure 1a, f–j). Further, pre-pandemic mortality trends for each cancer type continued unabated during the first pandemic year. We reasoned that this early drop in UC cancer deaths may be explained by changes in coding practices, so we next turned to MC mortality to resolve this question.

Figure 2
National-level weekly observed and estimated baseline mortality for each diagnosis group (Cancer (a), Diabetes (b), Alzheimer’s (c), Pancreatic Cancer (d), Colorectal Cancer (e), Hematologic Cancer (f)) as either the underlying cause or anywhere on the death certificate (multiple cause) from 2014 to 2020.

Red dashed lines represent the timing of the peaks for the three COVID-19 waves in 2020. Baselines during the pandemic are projected based on the previous years of data.

Table 2
The estimated number of excess deaths and the percentage over baseline for each diagnosis group when listed as either the underlying cause or anywhere on the death certificate (multiple cause).

Estimates for the national-level data are provided for the full pandemic period and for each state based on when the first large wave was experienced.

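| Cause of death | State | Wave | MC excess deaths | MC % over baseline | UC excess deaths | UC % over baseline |
|---|---|---|---|---|---|---|
| Cancer | National | Overall | 13,601* | 3.0 | 11 | 0.0 |
| Cancer | New York | 1 | 747 | 6.0 | –474 | –5.0 |
| Cancer | Texas | 2 | 467 | 4.0 | 39 | 0.0 |
| Cancer | California | 3 | 529 | 4.0 | 82 | 1.0 |
| Pancreatic cancer | National | Overall | –25 | –0.0 | –282 | –1.0 |
| Pancreatic cancer | New York | 1 | 8 | 1.0 | –16 | –2.0 |
| Pancreatic cancer | Texas | 2 | 17 | 2.0 | 24 | 3.0 |
| Pancreatic cancer | California | 3 | 0 | 0.0 | –18 | –2.0 |
| Colorectal cancer | National | Overall | 988 | 2.0 | –168 | –0.0 |
| Colorectal cancer | New York | 1 | 91 | 9.0 | –16 | –2.0 |
| Colorectal cancer | Texas | 2 | 4 | 0.0 | –34 | –3.0 |
| Colorectal cancer | California | 3 | 27 | 2.0 | –1 | –0.0 |
| Hematological cancers | National | Overall | 3615* | 7.0 | 111 | 0.0 |
| Hematological cancers | New York | 1 | 121 | 10.0 | –107 | –11.0 |
| Hematological cancers | Texas | 2 | 136 | 11.0 | 21 | 2.0 |
| Hematological cancers | California | 3 | 114 | 8.0 | 20 | 2.0 |
| Diabetes | National | Overall | 82,318* | 37.0 | 10,784* | 16.0 |
| Diabetes | New York | 1 | 5945* | 128.0 | 568* | 40.0 |
| Diabetes | Texas | 2 | 4612* | 77.0 | 420* | 23.0 |
| Diabetes | California | 3 | 3474* | 59.0 | 575* | 33.0 |
| Alzheimer’s | National | Overall | 21,712* | 19.0 | 8528* | 9.0 |
| Alzheimer’s | New York | 1 | 734* | 49.0 | 188 | 16.0 |
| Alzheimer’s | Texas | 2 | 1398* | 45.0 | 805* | 31.0 |
| Alzheimer’s | California | 3 | 726* | 18.0 | 259 | 8.0 |

* Confidence interval does not include zero.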

Time series of MC cancer mortality showed a significant increase in all three waves (Figure 2a; Appendix 1—table 2). A similar pattern was seen in MC time series for colorectal (Figure 2e), breast (Appendix 1—figure 1i), and hematological cancer (Appendix 1—figure 1j). However, the total excess mortality was modest, with 13,600 excess cancer deaths in 2020, representing a statistically significant 3% elevation over baseline (Table 2). The largest relative increase in MC mortality was observed in hematological cancer at 7% (statistically significant, 3600 excess deaths). No excess in MC mortality was seen for the two deadliest cancers, pancreatic cancer (Figure 2d) and lung cancer (Appendix 1—figure 1g).

National patterns in deaths due to other chronic conditions

We considered diabetes and Alzheimer’s as ‘positive controls’ as they are also considered COVID-19 risk factors and can illustrate how positive associations between chronic conditions and COVID-19 manifest in population-level excess mortality studies. Diabetes provides a particularly useful comparator for cancer as the mean age-at-death is similar (~72 years, Table 1) and because few individuals live in a nursing home (Appendix 1 - Supplemental Methods). Mortality time series from UC and MC diabetes and Alzheimer’s were highly correlated with COVID-19 activity, with statistically significant mortality elevation synchronous with pandemic waves (Figure 2b and c; Appendix 1—figures 2–5). For diabetes, we measured an excess of 10,800 and 82,300 deaths (UC and MC, respectively), corresponding to statistically significant elevations of 16% and 37% over baseline level mortality (Table 2). For Alzheimer’s, we estimated 8500 and 21,700 excess deaths (UC and MC, respectively), corresponding to statistically significant elevations of 9% and 19% over baseline. Pandemic-related excess mortality was also seen for IHD and kidney disease (see supplement for estimates, Appendix 1—table 2).

State-level patterns in excess mortality

Similar to patterns seen in national-level data, none of the state-level analyses revealed notable increases in UC cancer mortality, while there was a modest, non-significant increase in MC cancer mortality (Figures 3–5; Appendix 1—figures 6–8). The largest mortality increase was seen in NY during the spring wave, with a 6% rise in MC cancer mortality above the model baseline (Table 2; Appendix 1—table 3). The magnitude of the increase seen during the spring wave varied by cancer type, with minimal increases seen in pancreatic and lung cancers (1%) and higher increases in colorectal, hematological, and breast cancers (9%, 10%, and 16%, respectively). For comparison, there was a statistically significant rise in Alzheimer’s and diabetes deaths during this wave of 49% and 128%, respectively.

Figure 3
Weekly observed and estimated baseline mortality for each diagnosis group (Cancer (a), Diabetes (b), Alzheimer’s (c), Pancreatic Cancer (d), Colorectal Cancer (e), Hematologic Cancer (f)) as either the underlying cause or anywhere on the death certificate (multiple cause) from 2014 to 2020 in New York.

Red dashed lines represent the timing of the peaks for the three COVID-19 waves in 2020. New York experienced its first large wave of COVID-19 in spring 2020 (Wave 1).

Figure 4
Weekly observed and estimated baseline mortality for each diagnosis group (Cancer (a), Diabetes (b), Alzheimer’s (c), Pancreatic Cancer (d), Colorectal Cancer (e), Hematologic Cancer (f)) as either the underlying cause or anywhere on the death certificate (multiple cause) from 2014 to 2020 in Texas.

Red dashed lines represent the timing of the peaks for the three COVID-19 waves in 2020. Texas experienced its first large wave of COVID-19 in the summer of 2020 (Wave 2).

Figure 5
Weekly observed and estimated baseline mortality for each diagnosis group (Cancer (a), Diabetes (b), Alzheimer’s (c), Pancreatic Cancer (d), Colorectal Cancer (e), Hematologic Cancer (f)) as either the underlying cause or anywhere on the death certificate (multiple cause) from 2014 to 2020 in California.

Red dashed lines represent the timing of the peaks for the three COVID-19 waves in 2020. California did not experience a large wave of COVID-19 until the winter of 2020–2021 (Wave 3), only the first half of which is captured here.

In CA and TX, mortality fluctuations were less pronounced than in NY, coinciding with less intense COVID-19 waves, and this was seen across all conditions. MC excess mortality estimates remained within ±4% of baseline levels for cancers, irrespective of the type of cancer and pandemic wave, except for hematological cancers which saw an 11% rise in TX during the summer wave and an 8% rise in CA during the winter wave. None of these elevations were statistically significant. In comparison, there was statistically significant excess mortality elevation for both Alzheimer’s and diabetes deaths (range, 18–59% in the CA winter wave, and 45–77% in the TX summer wave, Table 2, Appendix 1—tables 4 and 5).

Demographic mortality projections under the null hypothesis that cancer in and of itself is not a risk factor for COVID-19 mortality

Next, to get a sense of the expected mortality elevation, we ran simulations to gauge the level of individual-level association (traditionally measured as relative risk [RR]) between COVID-19 and the studied chronic conditions that is consistent with the population-level excess mortality patterns observed during the pandemic. Using cancer as an example, two main factors could drive cancer mortality patterns during COVID-19, namely the size and age of the population living with cancer (since age is such a pronounced risk factor for COVID-19), and the life expectancy under cancer diagnosis. These factors would operate irrespective of the true biological relationship between COVID-19 severity and cancer. The same logic applies to mortality from other chronic conditions, such as diabetes or Alzheimer’s.

To test the hypothesis that these population factors alone could explain differences in excess mortality between chronic conditions, we designed a simple model of COVID-19 mortality for individuals with chronic conditions (see Materials and methods for details). The model projected excess mortality during the pandemic under the null hypothesis that the chronic condition was not in and of itself a risk factor for COVID-19 mortality, with only the demography of the population living with the disease (namely, the age and size of the at-risk populations and baseline risk of death from each condition) affecting excess mortality. In the demographic model, we first estimated the number of expected SARS-CoV-2 infections among persons with a certain condition, by multiplying the estimated number of US individuals living with the condition (CDC, Division of Population Health, 2022; Dhana et al., 2023; Rajan et al., 2021; U.S. Cancer Statistics Working Group, 2023) by the reported SARS-CoV-2 seroprevalence at the end of our study period (December 2020 for the national data, or after each wave for the state data) (Centers for Disease Control and Prevention, 2023). We focused on seroprevalence among individuals ≥65 years, the most relevant age group for the conditions we considered (we also ran a sensitivity analysis considering seroprevalence in adults 50–64 years, see Discussion). We then multiplied the estimated number of SARS-CoV-2 infections by an age-specific infection-fatality ratio (IFR) for SARS-CoV-2 (COVID-19 Forecasting Team, 2022). This gave an estimate of COVID-19-related deaths, or excess deaths, for a given condition. To estimate a percent elevation over baseline and compare with our vital statistics analysis, we divided the excess death estimate derived from the demographic model by the total deaths for that condition for a similar period of time in 2019 (see Materials and methods). We repeated this analysis for each cancer type, diabetes, and Alzheimer’s.

In addition to the null hypothesis, we also projected alternative hypotheses of a biological association between chronic conditions and COVID-19, assuming that a given chronic condition would raise the risk of COVID-19 mortality (via the IFR) by a factor of 2 or 5. We compared these modeled expectations for the null and alternative hypotheses with the observed excess mortality in 2020, using MC mortality as the outcome (Table 2).
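
The chain of multiplications in the demographic model can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the function and parameter names are hypothetical, and the seroprevalence of 6% used in the example is a placeholder chosen for illustration, not a value reported in this article. The population size and 2019 deaths are the national pancreatic cancer figures from Table 3, and the IFR of 2.1% is the value quoted for pancreatic cancer in the Discussion.

```python
def projected_excess(population, seroprevalence, ifr, baseline_deaths, rr=1.0):
    """Project COVID-19-related excess deaths for one chronic condition.

    population      : number of people living with the condition
    seroprevalence  : fraction infected by the end of the period (0-1)
    ifr             : age-specific infection-fatality ratio (0-1)
    baseline_deaths : deaths with the condition over the comparable 2019 period
    rr              : relative risk applied to the IFR (1 = null hypothesis)
    Returns (projected excess deaths, percent elevation over baseline).
    """
    infections = population * seroprevalence
    covid_deaths = infections * ifr * rr
    percent_over_baseline = 100.0 * covid_deaths / baseline_deaths
    return covid_deaths, percent_over_baseline


# National pancreatic cancer inputs (Table 3); seroprevalence is an ASSUMED placeholder.
PANCREATIC = dict(population=66_319, seroprevalence=0.06, ifr=0.021, baseline_deaths=39_798)

null_deaths, null_pct = projected_excess(**PANCREATIC)         # null hypothesis, RR = 1
rr2_deaths, rr2_pct = projected_excess(**PANCREATIC, rr=2.0)   # alternative, RR = 2
```

With these inputs the projected elevation is about 0.2% under the null hypothesis and about 0.4% when the IFR is doubled, which illustrates how a large baseline death count keeps the relative elevation small even when the relative risk is raised.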

Under the null hypothesis we projected a 0–2% elevation over the 2019 baseline in deaths for all cancer types in national data, and 0–9% elevations in state-level data (Table 3). Under the alternative hypothesis that cancer increases COVID-19 mortality risk by a factor of 2, the projected elevation is 0–5% in national data and 0–18% in state-level data. In general, the largest projected increases were found in NY state, driven by the higher attack rates. We also see systematic differences in the percent elevation over baseline by type of cancer, related to the lethality of different cancers. For instance, even if cancer increases COVID-19 mortality risk by a factor of 2, we expect to see only a 0–1% increase for particularly deadly cancers such as pancreatic and lung cancer, in part driven by the high competing risk of death from these cancers (short life expectancy) and the small size of the population-at-risk. The expected increases for less deadly cancers, such as colorectal and breast, were notably higher (2–5% in national data, and 9–18% during the large spring wave in NY), in part driven by the lower risk of death from these cancers (longer life expectancy). Based on the observations from our time series analysis of MC mortality in all states, non-hematological cancers are most consistent with a one- to twofold increase in mortality, with the caveat that most of the confidence intervals include zero, and the differences in projected mortality under these hypotheses are minimal. In contrast, for hematological cancer, the observed rise in mortality exceeds the expected elevation even under the assumption of a fivefold increase in mortality.

Table 3
Projections of COVID-19-related excess mortality patterns for different cancers and chronic conditions in the US, under different hypotheses for the association between the condition and COVID-19.

Projections are provided for the null hypothesis of no biological interaction between the condition and COVID-19; these projections are solely driven by the size and mean age of the population living with each condition (where age determines the infection-fatality ratio from COVID-19), and the baseline risk of death from the condition over a similar time period (March to December 2019 for the national data, and for the states comparable dates in 2019 corresponding to the relevant COVID-19 wave). Additional projections are provided under alternative hypotheses, where each condition is associated with a relative risk (RR) of 2 or 5 for COVID-19-related death (infection-fatality ratio multiplied by 2 or 5).

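| Chronic condition | State | Population-at-risk | Mean age | Wave | Observed MC deaths over same period in 2019 | Observed excess (% over baseline) in 2020 | Expected excess (null) | Expected excess (RR = 2) | Expected excess (RR = 5) |
|---|---|---|---|---|---|---|---|---|---|
| All cancers | National | 5,718,925 | 65 | Overall | 546,453 | 3 (1–4) | 1 (1–2) | 2 (1–4) | 6 (4–10) |
| All cancers | New York | 400,891 | 65 | Wave 1 | 12,244 | 6 (−1 to 15) | 4 (2–10) | 9 (3–20) | 22 (8–51) |
| All cancers | Texas | 397,993 | 63 | Wave 2 | 12,187 | 4 (−3 to 11) | 2 (1–6) | 5 (2–12) | 11 (4–29) |
| All cancers | California | 599,552 | 64 | Wave 3 | 16,713 | 4 (−1 to 10) | 2 (0–5) | 4 (1–9) | 9 (2–23) |
| Pancreatic | National | 66,319 | 67 | Overall | 39,798 | 0 (−6 to 7) | 0 (0–0) | 0 (0–1) | 1 (1–2) |
| Pancreatic | New York | 2584 | 67 | Wave 1 | 963 | 1 (−21 to 35) | 0 (0–1) | 1 (0–2) | 2 (1–5) |
| Pancreatic | Texas | 2264 | 66 | Wave 2 | 882 | 2 (−19 to 34) | 0 (0–1) | 0 (0–1) | 1 (0–3) |
| Pancreatic | California | 3482 | 67 | Wave 3 | 1277 | 0 (−17 to 24) | 0 (0–0) | 0 (0–1) | 1 (0–2) |
| Lung cancer | National | 425,015 | 70 | Overall | 123,622 | 1 (−3 to 5) | 1 (0–1) | 1 (1–2) | 3 (2–5) |
| Lung cancer | New York | 17,709 | 71 | Wave 1 | 2643 | 1 (−13 to 20) | 2 (1–4) | 3 (1–8) | 8 (3–20) |
| Lung cancer | Texas | 12,700 | 70 | Wave 2 | 2513 | 2 (−11 to 20) | 1 (0–2) | 1 (1–4) | 4 (1–9) |
| Lung cancer | California | 19,079 | 70 | Wave 3 | 2861 | 3 (−10 to 18) | 1 (0–2) | 1 (0–3) | 3 (1–8) |
| Hematological | National | 459,463 | 62 | Overall | 57,892 | 7 (1–13) | 1 (0–1) | 1 (1–2) | 3 (2–5) |
| Hematological | New York | 15,577 | 62 | Wave 1 | 1305 | 10 (−11 to 40) | 1 (0–3) | 2 (1–5) | 6 (2–13) |
| Hematological | Texas | 14,927 | 59 | Wave 2 | 1231 | 11 (−9 to 38) | 1 (0–1) | 1 (0–3) | 3 (1–7) |
| Hematological | California | 21,290 | 61 | Wave 3 | 1916 | 8 (−8 to 29) | 0 (0–1) | 1 (0–2) | 2 (1–5) |
| Colorectal | National | 473,264 | 66 | Overall | 49,053 | 2 (−4 to 8) | 1 (1–2) | 2 (1–4) | 6 (4–10) |
| Colorectal | New York | 30,859 | 66 | Wave 1 | 1048 | 9 (−13 to 44) | 4 (2–10) | 9 (3–20) | 22 (8–51) |
| Colorectal | Texas | 36,641 | 65 | Wave 2 | 1224 | 0 (−18 to 26) | 3 (1–7) | 5 (2–13) | 13 (4–33) |
| Colorectal | California | 51,863 | 65 | Wave 3 | 1575 | 2 (−14 to 24) | 2 (1–5) | 4 (1–10) | 9 (3–24) |
| Breast | National | 1,097,917 | 64 | Overall | 43,519 | 2 (−4 to 9) | 2 (2–4) | 5 (3–8) | 12 (8–21) |
| Breast | New York | 74,459 | 64 | Wave 1 | 981 | 16 (−8 to 53) | 9 (3–21) | 18 (7–42) | 46 (17–106) |
| Breast | Texas | 77,860 | 62 | Wave 2 | 1019 | 3 (−17 to 32) | 5 (2–12) | 10 (3–24) | 24 (8–61) |
| Breast | California | 123,433 | 63 | Wave 3 | 1421 | 2 (−15 to 25) | 4 (1–10) | 8 (2–20) | 20 (5–51) |
| Diabetes | National | 29,105,146 | 60 | Overall | 229,326 | 37 (31–43) | 8 (5–14) | 16 (10–28) | 40 (26–69) |
| Diabetes | New York | 1,792,926 | 60 | Wave 1 | 4804 | 128 (104–158) | 30 (11–68) | 59 (22–136) | 148 (55–340) |
| Diabetes | Texas | 2,450,005 | 58 | Wave 2 | 5898 | 77 (61–96) | 17 (6–44) | 35 (12–87) | 86 (30–218) |
| Diabetes | California | 3,514,440 | 59 | Wave 3 | 8399 | 59 (47–74) | 12 (3–32) | 25 (7–64) | 62 (17–160) |
| Alzheimer’s | National | 6,070,000 | 81 | Overall | 118,993 | 19 (11–28) | 28 (18–48) | 57 (36–96) | 142 (90–240) |
| Alzheimer’s | New York | 426,500 | 81 | Wave 1 | 1563 | 49 (23–87) | 191 (70–432) | 381 (140–863) | 953 (350–2158) |
| Alzheimer’s | Texas | 459,300 | 80 | Wave 2 | 2974 | 45 (27–69) | 63 (21–158) | 126 (43–315) | 315 (107–788) |
| Alzheimer’s | California | 719,700 | 81 | Wave 3 | 5394 | 18 (6–33) | 39 (11–98) | 78 (21–196) | 195 (53–491) |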

We repeated this analysis for diabetes and Alzheimer’s (Table 3). For diabetes under the null hypothesis, we projected an 8% elevation over baseline in national data and 12–30% in state-level data based on the age distribution and substantial size of the population-at-risk alone. In the vital statistics analysis, however, we observed a 37% elevation over baseline in national US data and 59–128% in state-level data, with the largest increase seen in NY and lowest increase in CA. These observations are most consistent with a fivefold increase in mortality based on our demographic model (projected elevation 40% nationally and 62–148% at the state level). For Alzheimer’s under the null hypothesis, we projected a 28% increase over baseline nationally, and 39–191% increases at the state level, largely driven by the advanced age of the population-at-risk. In contrast, analysis of vital statistics data reveals a 19% increase nationally and 18–49% across states, which is in fact lower than the projection under the null hypothesis (we return to this surprising result in the Discussion). Strikingly, our demographic model indicates that COVID-19 will manifest differently in population-level excess mortality for each of these chronic conditions, even under the null hypothesis of no biological association between viral infection and these underlying comorbidities. Overall, these projections support the idea that demography alone (age, size, and baseline mortality of the population living with each of these conditions) can explain much of the differences in absolute and relative mortality elevations seen during the pandemic across conditions like cancer, diabetes, and Alzheimer’s.
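
The comparison between observed elevations and the projections under each hypothesis can be illustrated with a small helper. This is a simplification for illustration only: it picks the hypothesis with the nearest point estimate and ignores the confidence intervals that the analysis relies on, and the function name is hypothetical. The numbers in the example are the national point estimates from Table 3.

```python
def closest_hypothesis(observed_pct, projected_pct):
    """Return the relative risk whose projected % elevation is nearest to the observed one.

    projected_pct maps a relative risk (1 = null hypothesis) to the projected
    percent elevation over baseline. Point estimates only; intervals are ignored.
    """
    return min(projected_pct, key=lambda rr: abs(projected_pct[rr] - observed_pct))


# National point estimates from Table 3: {RR: projected % over baseline}.
diabetes_rr = closest_hypothesis(37, {1: 8, 2: 16, 5: 40})
alzheimers_rr = closest_hypothesis(19, {1: 28, 2: 57, 5: 142})
```

For diabetes the nearest projection is the fivefold scenario, while for Alzheimer’s the nearest is the null hypothesis, with the observed value falling below it.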

Discussion

Cancer is generally thought of as a risk factor for severe COVID-19 outcomes, yet observational studies have produced conflicting evidence. With recent availability of more detailed US vital statistics data, we used statistical time series approaches to generate excess mortality estimates for MC death data, different types of cancer, and several geographical locations during 2020. We accounted for potential changes in coding practices during the pandemic, for instance capturing a COVID-19 patient with cancer whose death may have been coded as an underlying COVID-19 death and not a cancer death. Based on MC death data, we estimated 13,600 national COVID-19-related excess cancer deaths, which aligns well with reporting on death certificate data, where 13,400 deaths are ascribed to COVID-19 in cancer patients (Appendix 1—figure 9; Fedeli et al., 2024). Yet these deaths only represent a 3% elevation over the expected baseline cancer mortality. Percent mortality elevation was measurably higher for less deadly cancers (breast and colorectal) than cancers with a poor 5-year survival (lung and pancreatic). Consistent with other studies (Chavez-MacGregor et al., 2022; Han et al., 2022b; Rüthrich et al., 2021; Williamson et al., 2020), we found that the largest mortality increase for specific cancer types was seen in hematological cancers with a 7% elevation over baseline in national data. Across the board, the largest elevations in cancer mortality were observed in the states most impacted by the first year of the COVID-19 pandemic (e.g. NY), lending support to the specificity of our excess mortality approach.

In contrast to cancer, we observed substantial COVID-19-related excess mortality for diabetes and Alzheimer’s, temporally and geographically consistent with the three-wave ‘signature’ pattern observed in reported COVID-19 cases and deaths across the US. To investigate whether demographic differences in underlying patient populations (age distribution, population size, and baseline risk of death due to chronic condition) could explain differences in excess mortality during the pandemic, we ran a simple demographic model for each condition, first assuming the condition in and of itself was not a risk factor for COVID-19-related mortality (null hypothesis). Doing so, we found that the rise in cancer deaths during COVID-19 was expected to remain low compared to these other chronic conditions, largely driven by the higher risk of death from cancer itself compared to diabetes and Alzheimer’s. These demographic projections illustrate the importance of competing risks (Figure 6), where the risk of cancer death predominates over the risk of COVID-19 death in 2020. This effect is more pronounced in cancers with high mortality rates. For instance, even if pancreatic cancer had in fact doubled the risk of dying of COVID-19 (IFR = 4.2% vs 2.1%), we would only expect a rise in excess mortality of around 0.4% during the pandemic (Table 3), while the 2019 baseline risk of death for pancreatic cancer itself is over 60% (Figure 6). On the other hand, for conditions with a lower baseline mortality, such as diabetes, we expect substantial COVID-19-driven elevations in mortality.
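As a rough check on the pancreatic cancer example, the expected elevation can be written per person at risk, since the size of the population cancels out. The symbols N (population-at-risk), AR (attack rate), and D (baseline deaths) are notation introduced here for illustration. The calculation assumes the 6.3% attack rate reported for individuals over 65 at the end of 2020 and takes the baseline risk of death to be exactly 60%, although the text states only that it is over 60%, so the true figure would be slightly smaller.

```latex
\[
\text{expected elevation}
  = \frac{N \times \mathrm{AR} \times \mathrm{IFR}}{D}
  = \frac{\mathrm{AR} \times \mathrm{IFR}}{D / N}
  \approx \frac{0.063 \times 0.042}{0.60}
  \approx 0.0044 \quad (\text{about } 0.4\%)
\]
```

Because the denominator is so large for deadly cancers, even a doubled IFR produces only a small relative elevation under these assumptions.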

Figure 6. Illustration of competing risks.

Based on our demographic model, we expect a small increase in cancer mortality relative to diabetes and Alzheimer’s due to the higher competing risk of death from cancer compared to COVID-19. Panel (a) shows the log of the baseline mortality rate (based on observed mortality in 2019) from each condition on the x-axis and the log of the expected excess mortality (elevation over baseline) on the y-axis. Chronic conditions are shown in colors while states are shown in different shapes. Pancreatic cancer, the deadliest cancer considered, is on the bottom right (highest baseline mortality, lowest expected excess) while diabetes and Alzheimer’s are on the top left (lowest baseline mortality, highest expected excess). Panel (b) shows the baseline number of deaths per 100 persons at risk for each condition expected from March to December 2020 (based on deaths over this same period in 2019, orange dots) compared to the expected number of deaths due to COVID-19 under the null hypothesis (blue dots). The null hypothesis stipulates that there is no biological association between any of these chronic diseases and COVID-19. For diabetes and Alzheimer’s, the baseline risks of death are similar to the risk of death from COVID-19, while risk of death from cancer outcompetes risk of COVID-19 death for all types of cancer.

Our analysis revealed interesting differences between types of cancers. Both nationally and at the state level, the observed excess mortality for non-hematological cancers was consistent with a one- to twofold increase in COVID-19 mortality risk in patients with these types of cancer. Importantly, our analysis ignores any behavioral effects associated with the pandemic. It is conceivable that cancer patients may have shielded themselves from COVID-19 more than the average person in 2020. Our projections assume an average risk of infection for a typical individual over 65 years, as there are no serologic data on infection attack rates for specific clinical population subgroups (of any age). If shielding from exposure to SARS-CoV-2 was high among cancer patients, our projections of cancer excess mortality during the pandemic would be inflated. In other words, if shielding was particularly pronounced, cancer may conceivably be a stronger risk factor than shown here. Retrospective serologic analysis of banked sera from the first year of the pandemic, broken down by underlying comorbidities, may shed light on whether infection risk varied by chronic condition.

State-level mortality patterns can potentially provide complementary insights on the question of shielding. Because NY state experienced the earliest and most intense COVID-19 wave of the US, with over 20% of the population infected in spring 2020 (Stadlbauer et al., 2021), and because social distancing did not come into effect until March 2020, shielding would have had a more limited impact there than in other states. Thus, a biological relationship between cancer and COVID-19 would have been most dramatic in NY in spring 2020. Indeed, cancer excess mortality was exacerbated in NY, including a 9–16% increase in colorectal and breast cancer mortality, consistent with a twofold increase in COVID-19 mortality risk from these cancers, and a 10% increase in hematological cancers, consistent with a fivefold increase in COVID-19 mortality risk. In NY, the absence of excess mortality in lethal cancers such as pancreatic and lung cancers (1% over baseline) is, as discussed above, still consistent with what would be expected under a high competing risk situation.

We used diabetes and Alzheimer’s as positive controls for a known biological association between COVID-19 and chronic conditions. Diabetes stood out in our analyses with the highest absolute and relative increases in excess mortality during the pandemic. The magnitude of the mortality increases, both nationally and at the state level, was close to what would be expected if diabetes increased COVID-19 mortality fivefold. Many studies have shown that diabetes increases the risk of COVID-19 mortality, with an effect size around 2 (Williamson et al., 2020; Huang et al., 2020; Kastora et al., 2022). Impaired immune function and chronic inflammation have been identified as mechanisms driving poor outcomes for diabetes patients (Figueroa-Pizano et al., 2021). The discrepancy between the observed excess and our expectations may come down to uncertainty in the SARS-CoV-2 infection rates assumed in our demographic model. The population living with diabetes is slightly younger than that of the other conditions (mean age, 58–60 years), while we used serologic infection rates reported for individuals over 65 years in our main analysis. The SARS-CoV-2 attack rate among those 50–64 years was 10.1% at the end of 2020, compared to 6.3% in individuals over 65 (Centers for Disease Control and Prevention, 2023). A sensitivity analysis using this higher attack rate in our demographic model lends more support to the hypothesis that diabetes increases COVID-19 mortality twofold, rather than fivefold as found in our main analysis.

Our second positive control, Alzheimer’s, revealed surprising results. Although we observed significant excess mortality in MC Alzheimer’s data, it was still less than expected under the null hypothesis that Alzheimer’s was not a risk factor for COVID-19 mortality. This is unexpected in light of several observational studies that have shown Alzheimer’s to be a risk factor (Tahira et al., 2021; Wang et al., 2021; Zhang et al., 2021). As with cancer and diabetes, there is uncertainty in the SARS-CoV-2 infection rates used in the demographic model, due to the potential effect of shielding and the age-specific SARS-CoV-2 infection risk of the Alzheimer’s population. We estimated that the average age of the population living with Alzheimer’s disease was 80–81 years, and the infection rates for the general population over 65 years may not accurately reflect exposure in this subpopulation. Decreasing the attack rates by 20–30% (down to 4.4–5.0%) puts the observed estimates in the range of the expectations under the null hypothesis. Overall, given uncertainty in SARS-CoV-2 attack rates and the age and size of the population-at-risk for all studied conditions, our demographic model projections are not an exact tool to titrate excess mortality nor the RR associated with each condition. Our model merely serves as an illustration of the role of demography and competing risks.

Most vital statistics studies of the COVID-19 pandemic have relied on UC-specific deaths, which are prone to changes in coding practices. Our initial hypothesis going into this work was that coding changes associated with a better recognition of the impact of SARS-CoV-2 led to an underestimation of excess mortality from cancer, affecting our perception of the relationship between cancer and COVID-19. We certainly found an effect of coding changes, where for instance a drop in excess mortality in underlying cancer deaths turned into an increase in MC (any-listed) cancer deaths, particularly in the first COVID-19 pandemic wave. A similar observation was made by Fedeli et al., 2024. The impact of coding changes was also seen in mortality from other chronic conditions but was particularly important for cancer. Yet both the absolute and relative excess mortality elevation remained modest for cancer, even after adjustment for coding changes, highlighting the importance of additional mechanisms such as competing mortality risks between COVID-19 and cancer.

An interesting hypothesis was put forward 20 years ago proposing that immunosuppression from cancer may explain the lack of excess cancer mortality in the 1968 influenza pandemic: the immune incompetence rescue hypothesis (Reichert et al., 2004). This hypothesis contends that it is a detrimental immune response that leads to influenza death. A similar hypothesis was put forward to explain the extreme mortality in young healthy adults in the 1918 pandemic (Short et al., 2018). However, observational studies have found that patients with hematological cancers have twice the risk of dying compared to patients without cancer, likely due to the immunosuppression associated with their malignancy and treatment (Han et al., 2022a; Starkey et al., 2023; Williamson et al., 2020). Under the immune incompetence rescue hypothesis, hematological cancers would be expected to have the lowest excess mortality of all types of cancers. Our excess mortality analysis reveals instead that hematological cancers were the most impacted by the pandemic, relative to other types of cancer, with observed mortality patterns consistent with a fivefold increase in risk of COVID-19 death in patients with hematological cancers. Overall, we do not find any support for the immune incompetence rescue hypothesis.

Our study is subject to limitations. First, we did not study the potential long-term consequences of the pandemic on cancer care, which may have resulted in avoidance of the healthcare system for diagnosis or treatment. We did not see any delayed pandemic effect on mortality from pancreatic cancer, which may have manifested in 2020 given the very low survival rate of this cancer (Lemanska et al., 2023), but we cannot rule out longer-term effects on breast or colorectal cancers that would not be seen until 2021 or later (Doan et al., 2023; Han et al., 2023; Haribhai et al., 2023; Lee et al., 2023b; Nascimento de Lima et al., 2023; Nickson et al., 2023; Nonboe et al., 2023; Tope et al., 2023). Interestingly, in the US, all-cause underlying cancer mortality rates do not appear to rise between 2020 and 2023 (Appendix 1—figure 10), but data prior to the pandemic show a rise in cancer incidence, largely driven by increasing cancer rates in younger adults (Han et al., 2023; Siegel et al., 2024). Additional years of data will be important to evaluate the long-term impacts of the COVID-19 pandemic and these changing demographics on cancer mortality rates. Additional years of data will also be important for assessing the impact of vaccination on the relationship between cancer and COVID-19; there is evidence that vaccines may be less immunogenic in patients with cancer compared to those without (Seneviratne et al., 2022). Another limitation of our study is the reliance on mortality as an outcome, rather than on COVID-19-related hospitalization, morbidity, or Long COVID in cancer patients. A small US study reported that 60% of cancer patients suffered Long COVID symptoms (Dagher et al., 2023). Future analyses using hospitalization data and electronic medical records may provide additional insights on how different cancer stages or other comorbidities may contribute to increased risk of severe COVID-19 outcomes.

A few methodological limitations are also worth raising. Though it was important to assess excess mortality in state-level data because of asynchrony in pandemic waves, confidence intervals in state-level estimates were large, particularly for specific types of cancers, affecting significance levels. Additional methodological limitations relate to our demographic model, especially as regards assumptions about SARS-CoV-2 infection rates in populations of different ages and with different chronic conditions. Importantly, our conclusions regarding the importance of competing risks are robust to these assumptions. Finally, our study is a time-trend analysis and, as with cohort and case-control studies, correlation does not necessarily imply causation. However, the intensity and brevity of COVID-19 pandemic waves in space and time lend support to our analyses.

Conclusion

Our detailed excess mortality study considered six cancer types and found at most a modest elevation in cancer mortality during the COVID-19 pandemic in the US. Our results demonstrate the importance of considering MC-of-death records to account for changes in coding practices associated with the emergence of a new pathogen. In contrast to earlier studies, we propose that the lack of excess cancer mortality during the COVID-19 pandemic reflects the competing mortality risk from cancer itself (especially for deadly types like pancreatic and lung cancers) rather than protection conferred by immunosuppression. We note the more pronounced elevation in mortality from hematological cancers during the pandemic, compared to other cancers and to expectations from a demographic model, which is consistent with several cohort studies that singled out this group of cancer patients. Future research on the relationship between COVID-19 and cancer should concentrate on additional outcomes, such as excess hospitalizations, Long COVID, changes in screening practices during COVID-19, and longer-term patterns in cancer mortality.

Materials and methods

Data sources

US National vital statistics

We obtained individual ICD-10-coded death certificate data with exact date of death from the US for the period January 1, 2014, to December 31, 2020. Each death certificate has one underlying cause (UC) of death, defined as the disease or injury that initiated the train of events leading directly to death, and up to 20 causes of death in total, referred to here as multiple-cause (MC) mortality. We considered 10 conditions, including diabetes, Alzheimer’s disease, ischemic heart disease (IHD), kidney disease, and six types of cancer (all-cause cancer, colorectal, breast, pancreatic, lung, and hematological; see Table 1 and Appendix 1—table 1 for a list of disease codes). We chose these types of cancer to illustrate conditions for which the 5-year survival rate is low (13% and 25%, respectively, for pancreatic and lung cancers) and high (65% and 91%, respectively, for colorectal and breast cancers) (National Cancer Institute, 2024). Hematological cancer (67% 5-year survival) was included because it was singled out as a risk factor by previous studies. We compiled time series by week, geography (national aggregate, NY, TX, and CA), and cause of death, separately for UC and MC mortality.

To observe longer-term trends in later years of the COVID-19 pandemic, we downloaded aggregated weekly data from 2021 to 2023 for all-cause cancer, diabetes, and Alzheimer’s disease from CDC WONDER.

Estimated populations living with each chronic condition

We estimated the size of the population-at-risk for all-cause and specific cancers using the 5-year limited duration prevalence estimates provided by the US Cancer Statistics webpage (U.S. Cancer Statistics Working Group, 2023). Estimates for diabetes were drawn from CDC’s Behavioral Risk Factor Surveillance System Chronic Disease Indicators (CDC, Division of Population Health). Estimates for Alzheimer’s disease were taken from publications from the Alzheimer’s Association (Rajan et al., 2021; Dhana et al., 2023).

For each condition, age-specific prevalence data were tabulated for the US and for each state separately. For cancer, age-level data were only available at the national level so these age-specific prevalence estimates were applied to the populations for each of the three states considered (NY, CA, TX). Age-level data were provided for all ages for cancer (<20 years, 20–80 years in 10-year groupings, ≥80 years), for adults ≥18 for diabetes (18–44 years, 45–64 years, ≥65 years), and for adults ≥65 for Alzheimer’s disease (65–74 years, 75–84 years, ≥85 years). A weighted mean age for the population-at-risk for each condition was calculated using the mid-point for each age group.
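The weighted mean age described above can be sketched as follows. This is a minimal Python illustration rather than the authors' code: the function name, the prevalence counts, and the upper bound chosen for the open-ended oldest age group are all hypothetical.

```python
def weighted_mean_age(groups):
    """Prevalence-weighted mean age using age-group midpoints.

    groups: list of (lower_age, upper_age, prevalent_cases) tuples.
    """
    total_cases = sum(cases for _, _, cases in groups)
    if total_cases <= 0:
        raise ValueError("total prevalent cases must be positive")
    weighted_sum = sum((lo + hi) / 2 * cases for lo, hi, cases in groups)
    return weighted_sum / total_cases


# Hypothetical counts for the three Alzheimer's age groups. The upper bound
# of 94 for the open-ended >=85 group is an assumption for illustration only.
example_groups = [(65, 74, 100), (75, 84, 200), (85, 94, 200)]
example_mean_age = weighted_mean_age(example_groups)
```

The choice of midpoint for the open-ended oldest group is not stated in the text, and it can shift the resulting mean age noticeably for conditions concentrated in the oldest ages.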

Other data sources

To compare vital statistics patterns with COVID-19 surveillance data, we accessed national and state counts of laboratory-confirmed COVID-19 cases in 2020, from the CDC (Centers for Disease Control and Prevention, 2022).

To clarify the expected role of COVID-19 on excess mortality, we compiled data on the proportion of the population with serologic evidence of SARS-CoV-2 infection from the CDC dashboard (Centers for Disease Control and Prevention, 2023). We further compiled data on estimated age-specific IFRs from COVID-19, provided by single year of age (COVID-19 Forecasting Team, 2022).

Statistical approach

Weekly excess mortality models

Similar to other influenza and COVID-19 excess mortality studies (Islam et al., 2021; Karlinsky and Kobak, 2021; Lee et al., 2023a; Msemburi et al., 2023), we established a predicted baseline of expected mortality for each time series, and computed the excess mortality as the excess in observed deaths over this baseline. To establish baselines for each disease nationally and in each state, we applied negative binomial regression models to weekly mortality counts for each cause of death, smoothed with a 5-week moving average and rounded to the nearest integer. Models included harmonic terms for seasonality, time trends, and an offset for population size. For each condition and location, we used the Akaike information criterion (AIC) to select between three models with different time trends (see Appendix 1, Supplemental methods, and Appendix 1—figure 11 for the final model selection for each location and condition), following:

Model 1:

Weekly_mortality = t + cos(2πt/52.17) + sin(2πt/52.17) + offset(log(population)), where t represents week.

Model 2:

Weekly_mortality = t + t² + cos(2πt/52.17) + sin(2πt/52.17) + offset(log(population)), where t represents week.

Model 3:

Weekly_mortality = t + t² + t³ + cos(2πt/52.17) + sin(2πt/52.17) + offset(log(population)), where t represents week.
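The three candidate models share the same harmonic terms and differ only in the degree of the time trend. A minimal Python sketch of the covariates is given below. It is not the authors' code; the function names are hypothetical, and the intercept and log link are assumptions (a log link is implied by the log(population) offset).

```python
import math

WEEKS_PER_YEAR = 52.17  # period used in the harmonic terms above


def model_terms(t, trend_degree):
    """Covariates for week index t under Model 1, 2, or 3.

    trend_degree = 1, 2, or 3 gives t, (t, t^2), or (t, t^2, t^3),
    followed by the annual cosine and sine terms.
    """
    if trend_degree not in (1, 2, 3):
        raise ValueError("trend_degree must be 1, 2, or 3")
    trend = [t ** k for k in range(1, trend_degree + 1)]
    angle = 2 * math.pi * t / WEEKS_PER_YEAR
    return trend + [math.cos(angle), math.sin(angle)]


def log_expected_deaths(t, trend_degree, intercept, coefs, population):
    """Linear predictor on the log scale, with log(population) as offset.

    The intercept and log link are assumptions of this sketch.
    """
    terms = model_terms(t, trend_degree)
    if len(coefs) != len(terms):
        raise ValueError("one coefficient per term is required")
    linear = intercept + sum(b * x for b, x in zip(coefs, terms))
    return linear + math.log(population)
```

Keeping the offset outside the estimated coefficients means the fitted trend describes a mortality rate rather than a raw count, so changes in population size do not leak into the time trend.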

We fitted national- and state-level models for each mortality outcome from January 19, 2014, to March 1, 2020, and projected the baseline forward until December 6, 2020, the last complete week of smoothed mortality data. Models were fitted using the MASS package in R version 4.3.
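The smoothing applied to weekly counts before model fitting (a 5-week moving average, rounded to the nearest integer) can be sketched as follows. The text does not say whether the window is centered or trailing; a centered window is assumed here, and the function name is hypothetical.

```python
def smooth_weekly_counts(counts, window=5):
    """Centered moving average, rounded to the nearest integer.

    Weeks without a full window (the first and last window // 2 weeks)
    are returned as None rather than being padded.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(counts)):
        if i < half or i + half >= len(counts):
            smoothed.append(None)
        else:
            mean = sum(counts[i - half:i + half + 1]) / window
            # Round half up; valid because death counts are non-negative.
            smoothed.append(int(mean + 0.5))
    return smoothed
```

Under a centered window, the final weeks of the series cannot be smoothed, which would be consistent with the baseline projection ending before the last week of raw data.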

Using COVID-19-coded death certificates from March 1, 2020, to December 6, 2020, we established the timing of each pandemic wave from trough to trough. We found that nationally, the first wave occurred from March 1, 2020, to June 27, 2020; the second wave from June 28, 2020, to October 3, 2020, and the third from October 4, 2020, to December 6, 2020 (the third wave was not completed by the last week of available smoothed data on December 6, 2020). For NY, the pandemic pattern was characterized by an intense first wave in spring 2020, while TX had its major wave in summer 2020 and CA in late 2020. Comparison of mortality patterns from these three states provides an opportunity to separate the effect of SARS-CoV-2 infection from that of behavioral changes later in the pandemic. For instance, the effects of healthcare avoidance would predominate in CA or TX in spring 2020, as there was little SARS-CoV-2 activity but much media attention on COVID-19, with cancer patients potentially avoiding medical care out of fear of getting infected. In contrast, risk of infection would dominate in NY in spring 2020, and behavioral factors may only play a role as SARS-CoV-2 awareness increased and the wave was brought under control by social distancing.

We estimated weekly excess mortality by subtracting the predicted baseline from the observed mortality. We summed weekly estimates to calculate excess mortality for the full pandemic period and for each of the three waves within the first year of the pandemic. In addition to estimating the absolute effects of the pandemic on mortality, we also calculated relative effects by dividing excess deaths in each diagnosis group by the model baseline. Confidence intervals on excess mortality estimates were calculated by resampling the estimated model coefficients 10,000 times using a multivariate normal distribution and accounting for negative binomial errors in weekly mortality counts.
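The absolute and relative excess calculations can be sketched in Python as below. The wave boundaries are the national dates given above; the function name and the convention that each week is identified by a single date (assumed here to be the week start) are hypothetical. Confidence intervals are not covered by this sketch.

```python
from datetime import date

# National wave boundaries for 2020 as stated in the text.
WAVES = {
    1: (date(2020, 3, 1), date(2020, 6, 27)),
    2: (date(2020, 6, 28), date(2020, 10, 3)),
    3: (date(2020, 10, 4), date(2020, 12, 6)),
}


def excess_by_wave(weekly, waves=WAVES):
    """Sum weekly excess deaths within each wave and overall.

    weekly: list of (week_date, observed_deaths, baseline_deaths) tuples.
    Returns {label: (excess_deaths, percent_over_baseline)}.
    """
    periods = dict(waves)
    periods["overall"] = (
        min(start for start, _ in waves.values()),
        max(end for _, end in waves.values()),
    )
    results = {}
    for label, (start, end) in periods.items():
        rows = [(obs, base) for d, obs, base in weekly if start <= d <= end]
        excess = sum(obs - base for obs, base in rows)
        baseline = sum(base for _, base in rows)
        percent = 100 * excess / baseline if baseline else float("nan")
        results[label] = (excess, percent)
    return results
```

For example, a week with 110 observed deaths against a baseline of 100 contributes 10 excess deaths, or 10% over baseline if it were the only week in the wave.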

We used Pearson correlation to test the synchronicity between weekly excess mortality from different cancers and chronic conditions and weekly underlying COVID-19 deaths. Correlation analysis assumes a direct and immediate effect of COVID-19 on cancer mortality. We also investigated the possibility of delayed effects or harvesting by inspecting the time series for evidence of such effects and by comparing total excess deaths for distinct pandemic waves and the whole of 2020.
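A minimal Python sketch of the same-week (lag zero) Pearson correlation is shown below. The function name is hypothetical, and the sketch does not include any significance testing.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

Here x would be the weekly excess deaths for a condition and y the weekly underlying COVID-19 deaths over the same weeks.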

Projections of excess mortality under the null hypothesis of no specific COVID-19 mortality risk of each condition

To further test the impact of age on the association between chronic conditions and COVID-19 and clarify the additional risk due to each chronic condition, we projected the number of COVID-19 deaths under the null hypothesis that demographic characteristics alone (size, age, and baseline mortality risk for each condition) are driving excess mortality, and that there is no interaction between the condition and SARS-CoV-2 infection. Excess mortality projections were then compared with observed excess mortality. We only used MC deaths for this approach to account for the possibility that some individuals may suffer from multiple conditions. For example, an estimated 11.5% of US adults with type 2 diabetes also have a history of cancer (Yeh et al., 2018).

We first calculated the number of expected COVID-19 infections among persons living with a certain chronic condition by multiplying the estimated number of individuals living with the condition by the reported SARS-CoV-2 seroprevalence among individuals ≥65 years at specific time points during 2020. For the national data and CA, we used results from the survey conducted from November 23 to December 12, 2020. For NY, we used estimates from the survey conducted from July 27 to August 13, 2020 (the earliest data available), and for TX, we used the survey conducted from October 5 to 19, 2020 (following the large summer wave) (Centers for Disease Control and Prevention, 2023). We then multiplied the expected number of infections by the COVID-19 IFR for the estimated mean age of individuals living with the condition (COVID-19 Forecasting Team, 2022) to arrive at the projected number of COVID-19-related excess deaths for a particular condition during 2020. We derived uncertainty intervals around these estimates using the lower and upper bounds of the estimated attack rates and COVID-19 IFRs.

To obtain a relative metric of expected COVID-19 burden, we divided projected COVID-19 excess deaths by total deaths in each diagnosis group in the 2019 baseline period (March to December 2019 for the national data; for the states, we used the months in 2019 corresponding to their large waves in 2020), resulting in an expected percentage elevation over baseline in 2020. We compared this null expectation to the observed percentage elevation over baseline from our excess mortality models. We also generated the expected number of excess deaths under alternative hypotheses where each condition is associated with a two- or fivefold increased risk of COVID-19-related death given infection (i.e. the baseline age-adjusted IFR used in the null hypothesis was increased two- or fivefold).

The equation for the expected percent increase in excess mortality over baseline deaths under the null hypothesis, for a specific risk condition (cancer, diabetes, Alzheimer’s) and time period, can be written as:

Expected percent increase in excess mortality for a chronic condition and time period = (size of population-at-risk for the condition * SARS-CoV-2 infection rate for the period * age-specific IFR)/baseline mortality for the condition in comparable period in 2019.

The expected mortality increase under the alternative hypothesis of a two- or fivefold increased risk of COVID-19 death from the condition under study is modeled by multiplying the right-hand side of the above equation by the increased risk (i.e. we assume that presence of the underlying condition will increase the IFR by two- or fivefold compared to the IFR for the general population).
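The equation and its two- and fivefold variants can be sketched in Python as follows. The function and parameter names are hypothetical, and the example inputs are made-up round numbers chosen only to make the arithmetic easy to follow; they are not values from the study.

```python
def expected_percent_increase(population_at_risk, attack_rate, ifr,
                              baseline_deaths, relative_risk=1.0):
    """Expected percent elevation over baseline deaths.

    relative_risk = 1 is the null hypothesis; 2 or 5 scales the IFR
    for the alternative hypotheses described above.
    """
    if baseline_deaths <= 0:
        raise ValueError("baseline_deaths must be positive")
    expected_covid_deaths = (
        population_at_risk * attack_rate * ifr * relative_risk
    )
    return 100 * expected_covid_deaths / baseline_deaths


# Hypothetical inputs: 1,000,000 people at risk, 5% attack rate,
# 2% IFR, and 50,000 baseline deaths in the comparison period.
null_pct = expected_percent_increase(1_000_000, 0.05, 0.02, 50_000)
twofold_pct = expected_percent_increase(
    1_000_000, 0.05, 0.02, 50_000, relative_risk=2
)
```

With these inputs, 1,000 COVID-19 deaths are expected under the null hypothesis. The same number of expected deaths would be a much smaller percentage for a condition with a larger baseline death count, which is the competing-risk effect the model is meant to illustrate.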

Appendix 1

Supplemental methods

Model selection and cross-validation

Time series models included harmonic terms for seasonality, time trends, and an offset for population size. For each condition and location, we used AIC to select between three models with different time trends. The starting model (Model 1) included only a linear time trend. We then tested this against a model with linear and quadratic time trends (Model 2). If the AIC of Model 2 was not at least 2 units lower than that of Model 1, Model 1 was used as the final model. Otherwise, Model 2 was tested against a model with linear, quadratic, and cubic time trends (Model 3). If the AIC of Model 3 was not at least 2 units lower than that of Model 2, Model 2 was taken as the final model; otherwise, Model 3 was taken as the final model. The final model for each condition and location was then applied to the data from 2014 to 2018 only and used to predict the 2019 data. The coverage probability was calculated as the proportion of weeks of observed data in 2019 that fell within the 95% prediction interval of the time series model. The final model selected for each condition and location is provided in Appendix 1—figure 11.
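The stepwise selection rule and the coverage probability can be sketched in Python as follows. The function names are hypothetical, and reading "2 less" as "at least 2 units lower" is an interpretation made for this sketch.

```python
def select_trend_model(aic_model1, aic_model2, aic_model3, min_improvement=2.0):
    """Return 1, 2, or 3 following the stepwise AIC rule described above.

    A more complex model is kept only if it lowers the AIC by at least
    min_improvement relative to the next simpler model.
    """
    if aic_model1 - aic_model2 < min_improvement:
        return 1
    if aic_model2 - aic_model3 < min_improvement:
        return 2
    return 3


def coverage_probability(observed, lower, upper):
    """Share of weeks whose observed value lies within [lower, upper]."""
    if not (len(observed) == len(lower) == len(upper)) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    inside = sum(lo <= obs <= hi for obs, lo, hi in zip(observed, lower, upper))
    return inside / len(observed)
```

Note that Model 3 is only considered once Model 2 has beaten Model 1, so a cubic trend cannot be selected on the strength of its AIC alone.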

Characteristics of cancer, diabetes, and Alzheimer’s deaths in the pre-pandemic period

For each chronic condition studied (cancer, diabetes, Alzheimer’s), we assessed potential changes in the characteristics of deaths during the pandemic period that are unrelated to timing but may signal an association with COVID-19. For instance, age is known to be a major risk factor for COVID-19 mortality. For each chronic condition, we computed the average age-at-death in the pre-pandemic year 2019, and compared this to the average age-at-death in 2020. The second potential confounder is living arrangement, as individuals living in nursing homes may be at increased risk of exposure to (and death from) COVID-19 due to mixing, even if their underlying condition is not per se a risk factor. To test this hypothesis, we also compared the proportion of individuals in each disease group who died in nursing homes in 2019 and 2020. Finally, to illustrate the impact of coding practices, we compared ICD-10 letter categories between 2020 and 2019 for the UC of death when cancer or diabetes is included on the death certificate but is not listed as the UC of death (Appendix 1—figure 9). For 2020, we further compared death certificates listing both COVID-19 and cancer to those listing both COVID-19 and diabetes. For all comparisons between 2019 and 2020, data are limited to March to December to isolate the pandemic period.

Appendix 1—table 1
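Diagnosis groups and corresponding ICD-10 codes, number of underlying and multiple cause deaths, mean age in years at time of death, the percentage of deaths occurring at home, and the percentage of deaths occurring in nursing homes for 2019 and 2020.

| Year | Diagnosis group | ICD-10 codes | Underlying cause: No. deaths | Underlying cause: Mean age, years (IQR) | Underlying cause: % Home/ER | Underlying cause: % Nursing home | Multiple cause: No. deaths | Multiple cause: Mean age, years (IQR) | Multiple cause: % Home/ER | Multiple cause: % Nursing home |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2019 | Cancer | C00-C99 | 493,397 | 72 (64–81) | 45 | 12 | 546,453 | 72 (64–82) | 44 | 13 |
| 2019 | Pancreatic cancer | C25 | 37,864 | 72 (64–80) | 51 | 9 | 39,798 | 72 (64–80) | 50 | 9 |
| 2019 | Lung cancer | C34 | 114,552 | 72 (65–80) | 45 | 12 | 123,622 | 72 (65–80) | 44 | 12 |
| 2019 | Colorectal cancer | C18-C20 | 42,484 | 71 (61–82) | 46 | 13 | 49,053 | 72 (62–83) | 45 | 14 |
| 2019 | Breast cancer | C50 | 35,115 | 69 (59–81) | 44 | 13 | 43,519 | 71 (61–83) | 43 | 15 |
| 2019 | Hematological cancer | C81-C96 | 47,174 | 74 (67–84) | 35 | 11 | 57,892 | 74 (67–84) | 35 | 12 |
| 2019 | Diabetes | E10-E14 | 70,763 | 72 (63–82) | 53 | 17 | 229,326 | 74 (65–84) | 46 | 19 |
| 2019 | Alzheimer’s | G30 | 98,675 | 87 (82–92) | 29 | 50 | 118,993 | 87 (82–92) | 29 | 48 |
| 2019 | Ischemic heart disease | I20-I25 | 292,659 | 77 (67–88) | 50 | 18 | 440,225 | 77 (68–87) | 47 | 18 |
| 2019 | Kidney disease | N00-N07, N17-N19, N25-N28 | 46,120 | 76 (68–87) | 25 | 18 | 189,938 | 76 (67–87) | 20 | 15 |
| 2020 | Cancer | C00-C99 | 513,275 | 72 (64–81) | 55 | 8 | 586,503 | 72 (64–82) | 52 | 9 |
| 2020 | Pancreatic cancer | C25 | 39,893 | 72 (65–80) | 61 | 6 | 42,383 | 72 (65–80) | 60 | 6 |
| 2020 | Lung cancer | C34 | 115,554 | 72 (65–80) | 54 | 8 | 127,671 | 72 (65–80) | 53 | 8 |
| 2020 | Colorectal cancer | C18-C20 | 43,990 | 71 (61–82) | 56 | 9 | 52,319 | 72 (62–83) | 53 | 10 |
| 2020 | Breast cancer | C50 | 36,296 | 70 (60–81) | 54 | 10 | 47,094 | 72 (62–83) | 51 | 12 |
| 2020 | Hematological cancer | C81-C96 | 49,161 | 74 (67–84) | 46 | 8 | 64,840 | 74 (68–84) | 43 | 9 |
| 2020 | Diabetes | E10-E14 | 88,124 | 71 (62–82) | 58 | 15 | 343,061 | 73 (65–83) | 45 | 16 |
| 2020 | Alzheimer’s | G30 | 115,256 | 86 (82–92) | 33 | 46 | 151,206 | 86 (82–92) | 31 | 47 |
| 2020 | Ischemic heart disease | I20-I25 | 327,854 | 76 (67–88) | 54 | 16 | 533,204 | 77 (68–87) | 49 | 16 |
| 2020 | Kidney disease | N00-N07, N17-N19, N25-N28 | 49,796 | 76 (68–87) | 30 | 15 | 255,708 | 75 (67–86) | 21 | 12 |

Appendix 1—table 2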
Estimated Excess Deaths by Cause and Wave (National).

Estimated number of excess deaths and the percentage over baseline for each diagnosis group (National). Estimates are aggregated over all of 2020 and for each COVID-19 wave during 2020.

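| Cause of death | Wave | Multiple cause: excess deaths | Multiple cause: % over baseline | Underlying cause: excess deaths | Underlying cause: % over baseline |
|---|---|---|---|---|---|
| Cancer | Overall | 13,601* | 3.0 | 11 | 0.0 |
| Cancer | 1 | 79 | 0.0 | -3917* | -2.0 |
| Cancer | 2 | 6519* | 4.0 | 2662 | 2.0 |
| Cancer | 3 | 7003* | 6.0 | 1266 | 1.0 |
| Pancreatic cancer | Overall | -25 | -0.0 | -282 | -1.0 |
| Pancreatic cancer | 1 | -213 | -1.0 | -281 | -2.0 |
| Pancreatic cancer | 2 | 44 | 0.0 | -30 | -0.0 |
| Pancreatic cancer | 3 | 144 | 1.0 | 29 | 0.0 |
| Lung cancer | Overall | 1102 | 1.0 | -814 | -1.0 |
| Lung cancer | 1 | -729 | -1.0 | -1221 | -3.0 |
| Lung cancer | 2 | 784 | 2.0 | 249 | 1.0 |
| Lung cancer | 3 | 1047 | 4.0 | 158 | 1.0 |
| Breast cancer | Overall | 838 | 2.0 | -438 | -1.0 |
| Breast cancer | 1 | -66 | -0.0 | -415 | -3.0 |
| Breast cancer | 2 | 437 | 3.0 | 81 | 1.0 |
| Breast cancer | 3 | 467 | 5.0 | -105 | -1.0 |
| Colorectal cancer | Overall | 988 | 2.0 | -168 | -0.0 |
| Colorectal cancer | 1 | -169 | -1.0 | -463 | -3.0 |
| Colorectal cancer | 2 | 454 | 3.0 | 112 | 1.0 |
| Colorectal cancer | 3 | 703* | 6.0 | 183 | 2.0 |
| Hematological cancers | Overall | 3615* | 7.0 | 111 | 0.0 |
| Hematological cancers | 1 | 546 | 2.0 | -447 | -2.0 |
| Hematological cancers | 2 | 1412* | 8.0 | 412 | 3.0 |
| Hematological cancers | 3 | 1657* | 12.0 | 146 | 1.0 |
| Diabetes | Overall | 82,318* | 37.0 | 10,784* | 16.0 |
| Diabetes | 1 | 25,306* | 25.0 | 2305* | 7.0 |
| Diabetes | 2 | 27,534* | 38.0 | 4330* | 20.0 |
| Diabetes | 3 | 29,477* | 56.0 | 4148* | 26.0 |
| Alzheimer’s | Overall | 21,712* | 19.0 | 8528* | 9.0 |
| Alzheimer’s | 1 | 4763* | 9.0 | 547 | 1.0 |
| Alzheimer’s | 2 | 8054* | 22.0 | 4257* | 14.0 |
| Alzheimer’s | 3 | 8894* | 33.0 | 3724* | 16.0 |
| Ischemic heart disease | Overall | 58,793* | 14.0 | 17,194* | 6.0 |
| Ischemic heart disease | 1 | 12,042* | 6.0 | 862 | 1.0 |
| Ischemic heart disease | 2 | 21,783* | 16.0 | 7912* | 9.0 |
| Ischemic heart disease | 3 | 24,967* | 25.0 | 8419* | 13.0 |
| Kidney disease | Overall | 41,907* | 22.0 | 785 | 2.0 |
| Kidney disease | 1 | 8182* | 10.0 | -1048 | -5.0 |
| Kidney disease | 2 | 14,767* | 25.0 | 777 | 5.0 |
| Kidney disease | 3 | 18,958* | 44.0 | 1056* | 10.0 |

* Confidence interval does not include zero.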
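
The two quantities reported in this table are linked by simple arithmetic: excess deaths are observed deaths minus baseline deaths, and the percentage over baseline is that excess divided by the baseline. The following is a minimal sketch under that assumption; the function name `excess_summary` and the weekly counts are hypothetical and are not the authors' code or data. The final check uses the multiple cause cancer values from the table to confirm that the wave-level estimates add up to the overall estimate, allowing for rounding.

```python
def excess_summary(observed, baseline):
    """Return (excess deaths, % over baseline) for paired weekly counts."""
    if len(observed) != len(baseline):
        raise ValueError("observed and baseline must have the same length")
    total_baseline = sum(baseline)
    if total_baseline <= 0:
        raise ValueError("baseline total must be positive")
    excess = sum(observed) - total_baseline
    pct_over = 100.0 * excess / total_baseline
    return excess, pct_over


# Hypothetical weekly counts, for illustration only (not study data).
observed = [1050, 1100, 1200, 1080]
baseline = [1000, 1000, 1000, 1000]
excess, pct = excess_summary(observed, baseline)
print(f"excess deaths = {excess}, % over baseline = {pct:.2f}")

# Values taken from the national table: multiple cause cancer, waves 1-3.
mc_cancer_waves = [79, 6519, 7003]
mc_cancer_overall = 13601
# Wave estimates should sum to the overall estimate, up to rounding.
rounding_gap = abs(sum(mc_cancer_waves) - mc_cancer_overall)
print(f"rounding gap between waves and overall = {rounding_gap}")
```

The same check can be repeated for any row group in the national and state tables; small gaps of one or two deaths are expected because each estimate is rounded separately.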

Appendix 1—table 3
Estimated Excess Deaths by Cause and Wave (New York).

Estimated number of excess deaths and the percentage over baseline for each diagnosis group (New York). Estimates are aggregated over all of 2020 and for each COVID-19 wave during 2020.

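| Cause of death | Wave | Multiple cause: excess deaths | Multiple cause: % over baseline | Underlying cause: excess deaths | Underlying cause: % over baseline |
|---|---|---|---|---|---|
| Cancer | Overall | 1012 | 4.0 | -557 | -2.0 |
| Cancer | 1 | 747 | 6.0 | -474 | -5.0 |
| Cancer | 2 | 120 | 1.0 | -6 | -0.0 |
| Cancer | 3 | 144 | 2.0 | -77 | -1.0 |
| Pancreatic cancer | Overall | -29 | -1.0 | -58 | -3.0 |
| Pancreatic cancer | 1 | 8 | 1.0 | -16 | -2.0 |
| Pancreatic cancer | 2 | -1 | -0.0 | -9 | -1.0 |
| Pancreatic cancer | 3 | -37 | -6.0 | -33 | -6.0 |
| Lung cancer | Overall | 47 | 1.0 | -163 | -3.0 |
| Lung cancer | 1 | 27 | 1.0 | -143 | -7.0 |
| Lung cancer | 2 | 23 | 1.0 | 16 | 1.0 |
| Lung cancer | 3 | -3 | -0.0 | -36 | -3.0 |
| Breast cancer | Overall | 205 | 9.0 | -46 | -2.0 |
| Breast cancer | 1 | 151 | 16.0 | -34 | -4.0 |
| Breast cancer | 2 | 31 | 4.0 | 3 | 0.0 |
| Breast cancer | 3 | 23 | 4.0 | -15 | -3.0 |
| Colorectal cancer | Overall | 189 | 8.0 | 42 | 2.0 |
| Colorectal cancer | 1 | 91 | 9.0 | -16 | -2.0 |
| Colorectal cancer | 2 | 40 | 5.0 | 26 | 4.0 |
| Colorectal cancer | 3 | 58 | 9.0 | 33 | 6.0 |
| Hematological cancers | Overall | 156 | 5.0 | -149 | -6.0 |
| Hematological cancers | 1 | 121 | 10.0 | -107 | -11.0 |
| Hematological cancers | 2 | 1 | 0.0 | -25 | -3.0 |
| Hematological cancers | 3 | 35 | 5.0 | -18 | -3.0 |
| Diabetes | Overall | 7240* | 66.0 | 866* | 26.0 |
| Diabetes | 1 | 5945* | 128.0 | 568* | 40.0 |
| Diabetes | 2 | 631* | 18.0 | 121 | 11.0 |
| Diabetes | 3 | 664* | 24.0 | 177 | 21.0 |
| Alzheimer’s | Overall | 884* | 26.0 | 233 | 9.0 |
| Alzheimer’s | 1 | 734* | 49.0 | 188 | 16.0 |
| Alzheimer’s | 2 | 1 | 0.0 | 1 | 0.0 |
| Alzheimer’s | 3 | 150 | 17.0 | 44 | 6.0 |
| Ischemic heart disease | Overall | 7118* | 25.0 | 3756* | 17.0 |
| Ischemic heart disease | 1 | 6607* | 54.0 | 4092* | 44.0 |
| Ischemic heart disease | 2 | 179 | 2.0 | -184 | -3.0 |
| Ischemic heart disease | 3 | 331 | 5.0 | -152 | -3.0 |
| Kidney disease | Overall | 2438* | 34.0 | 51 | 3.0 |
| Kidney disease | 1 | 1946* | 63.0 | 22 | 3.0 |
| Kidney disease | 2 | 144 | 6.0 | -13 | -2.0 |
| Kidney disease | 3 | 349* | 19.0 | 42 | 8.0 |

* Confidence interval does not include zero.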

Appendix 1—table 4
Estimated Excess Deaths by Cause and Wave (Texas).

Estimated number of excess deaths and the percentage over baseline for each diagnosis group (Texas). Estimates are aggregated over all of 2020 and for each COVID-19 wave during 2020.

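| Cause of death | Wave | Multiple cause: excess deaths | Multiple cause: % over baseline | Underlying cause: excess deaths | Underlying cause: % over baseline |
|---|---|---|---|---|---|
| Cancer | Overall | 602 | 2.0 | -130 | -0.0 |
| Cancer | 1 | -48 | -0.0 | -62 | -0.0 |
| Cancer | 2 | 467 | 4.0 | 39 | 0.0 |
| Cancer | 3 | 183 | 2.0 | -107 | -1.0 |
| Pancreatic cancer | Overall | 1 | 0.0 | 5 | 0.0 |
| Pancreatic cancer | 1 | -36 | -3.0 | -36 | -4.0 |
| Pancreatic cancer | 2 | 17 | 2.0 | 24 | 3.0 |
| Pancreatic cancer | 3 | 20 | 3.0 | 17 | 3.0 |
| Lung cancer | Overall | 176 | 2.0 | 108 | 2.0 |
| Lung cancer | 1 | 33 | 1.0 | 31 | 1.0 |
| Lung cancer | 2 | 60 | 2.0 | 27 | 1.0 |
| Lung cancer | 3 | 84 | 5.0 | 49 | 3.0 |
| Breast cancer | Overall | -19 | -1.0 | -131 | -5.0 |
| Breast cancer | 1 | -54 | -4.0 | -54 | -6.0 |
| Breast cancer | 2 | 29 | 3.0 | -25 | -3.0 |
| Breast cancer | 3 | 6 | 1.0 | -51 | -8.0 |
| Colorectal cancer | Overall | -12 | -0.0 | -92 | -3.0 |
| Colorectal cancer | 1 | -33 | -2.0 | -49 | -4.0 |
| Colorectal cancer | 2 | 4 | 0.0 | -34 | -3.0 |
| Colorectal cancer | 3 | 17 | 2.0 | -10 | -1.0 |
| Hematological cancers | Overall | 194 | 5.0 | -12 | -0.0 |
| Hematological cancers | 1 | 24 | 2.0 | 1 | 0.0 |
| Hematological cancers | 2 | 136 | 11.0 | 21 | 2.0 |
| Hematological cancers | 3 | 33 | 3.0 | -34 | -4.0 |
| Diabetes | Overall | 8902* | 49.0 | 618 | 11.0 |
| Diabetes | 1 | 1411* | 19.0 | 61 | 3.0 |
| Diabetes | 2 | 4612* | 77.0 | 420* | 23.0 |
| Diabetes | 3 | 2879* | 62.0 | 138 | 9.0 |
| Alzheimer’s | Overall | 2242* | 24.0 | 1184 | 15.0 |
| Alzheimer’s | 1 | 309 | 8.0 | 197 | 6.0 |
| Alzheimer’s | 2 | 1398* | 45.0 | 805* | 31.0 |
| Alzheimer’s | 3 | 536* | 21.0 | 181 | 8.0 |
| Ischemic heart disease | Overall | 6018* | 20.0 | 1700 | 9.0 |
| Ischemic heart disease | 1 | 736 | 6.0 | 99 | 1.0 |
| Ischemic heart disease | 2 | 3376* | 34.0 | 1228* | 19.0 |
| Ischemic heart disease | 3 | 1905* | 24.0 | 374 | 7.0 |
| Kidney disease | Overall | 6724* | 47.0 | 579 | 19.0 |
| Kidney disease | 1 | 886* | 15.0 | 115 | 9.0 |
| Kidney disease | 2 | 3535* | 76.0 | 285* | 28.0 |
| Kidney disease | 3 | 2303* | 66.0 | 179 | 23.0 |

* Confidence interval does not include zero.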

Appendix 1—table 5
Estimated Excess Deaths by Cause and Wave (California).

Estimated number of excess deaths and the percentage over baseline for each diagnosis group (California). Estimates are aggregated over all of 2020 and for each COVID-19 wave during 2020.

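| Cause of death | Wave | Multiple cause: excess deaths | Multiple cause: % over baseline | Underlying cause: excess deaths | Underlying cause: % over baseline |
|---|---|---|---|---|---|
| Cancer | Overall | 991 | 2.0 | -29 | -0.0 |
| Cancer | 1 | -102 | -1.0 | -236 | -1.0 |
| Cancer | 2 | 564 | 3.0 | 125 | 1.0 |
| Cancer | 3 | 529 | 4.0 | 82 | 1.0 |
| Pancreatic cancer | Overall | -97 | -3.0 | -126 | -4.0 |
| Pancreatic cancer | 1 | -28 | -2.0 | -39 | -3.0 |
| Pancreatic cancer | 2 | -69 | -5.0 | -70 | -6.0 |
| Pancreatic cancer | 3 | 0 | 0.0 | -18 | -2.0 |
| Lung cancer | Overall | -10 | -0.0 | -132 | -2.0 |
| Lung cancer | 1 | -82 | -3.0 | -96 | -3.0 |
| Lung cancer | 2 | 18 | 1.0 | -48 | -2.0 |
| Lung cancer | 3 | 54 | 3.0 | 13 | 1.0 |
| Breast cancer | Overall | 67 | 2.0 | -22 | -1.0 |
| Breast cancer | 1 | -44 | -3.0 | -34 | -3.0 |
| Breast cancer | 2 | 92 | 6.0 | 44 | 4.0 |
| Breast cancer | 3 | 20 | 2.0 | -33 | -4.0 |
| Colorectal cancer | Overall | 100 | 2.0 | 20 | 1.0 |
| Colorectal cancer | 1 | 7 | 0.0 | -4 | -0.0 |
| Colorectal cancer | 2 | 66 | 4.0 | 25 | 2.0 |
| Colorectal cancer | 3 | 27 | 2.0 | -1 | -0.0 |
| Hematological cancers | Overall | 279 | 5.0 | 52 | 1.0 |
| Hematological cancers | 1 | 0 | 0.0 | -33 | -2.0 |
| Hematological cancers | 2 | 164 | 9.0 | 64 | 4.0 |
| Hematological cancers | 3 | 114 | 8.0 | 20 | 2.0 |
| Diabetes | Overall | 9163* | 39.0 | 1408* | 20.0 |
| Diabetes | 1 | 1843* | 18.0 | 213 | 7.0 |
| Diabetes | 2 | 3846* | 49.0 | 620* | 27.0 |
| Diabetes | 3 | 3474* | 59.0 | 575* | 33.0 |
| Alzheimer’s | Overall | 2143* | 14.0 | 594 | 5.0 |
| Alzheimer’s | 1 | 375 | 6.0 | -76 | -1.0 |
| Alzheimer’s | 2 | 1041* | 20.0 | 410 | 9.0 |
| Alzheimer’s | 3 | 726* | 18.0 | 259 | 8.0 |
| Ischemic heart disease | Overall | 5905* | 16.0 | 2888* | 11.0 |
| Ischemic heart disease | 1 | 650 | 4.0 | 104 | 1.0 |
| Ischemic heart disease | 2 | 2966* | 24.0 | 1581* | 19.0 |
| Ischemic heart disease | 3 | 2289* | 25.0 | 1204* | 19.0 |
| Kidney disease | Overall | 3858* | 21.0 | 8 | 0.0 |
| Kidney disease | 1 | 301 | 4.0 | -114 | -8.0 |
| Kidney disease | 2 | 1967* | 33.0 | 63 | 6.0 |
| Kidney disease | 3 | 1590* | 36.0 | 59 | 7.0 |

* Confidence interval does not include zero.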

Appendix 1—figure 1
National-level weekly observed and estimated baseline mortality for each diagnosis group (Cancer (a), Diabetes (b), Alzheimer’s (c), Ischemic Heart Disease (d), Kidney Disease (e), Pancreatic Cancer (f), Lung Cancer (g), Colorectal Cancer (h), Breast Cancer (i), Hematological Cancer (j)) as either the underlying cause or anywhere on the death certificate (multiple cause) from 2017 to 2020.

Red dashed lines represent the timing of the peaks for the three COVID-19 waves in 2020. Baselines during the pandemic are projected based on the previous years of data.

Appendix 1—figure 2
Correlation between weekly number of COVID-19-coded deaths and excess underlying deaths for each diagnosis group (National).
Appendix 1—figure 3
Correlation between weekly number of COVID-19-coded deaths and excess multiple cause deaths for each diagnosis group (National).
Appendix 1—figure 4
Correlation between weekly number of COVID-19-coded deaths and excess underlying deaths for each diagnosis group (New York).
Appendix 1—figure 5
Correlation between weekly number of COVID-19-coded deaths and excess multiple cause deaths for each diagnosis group (New York).
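
The correlation figures above relate two weekly series: COVID-19-coded deaths and excess deaths for a diagnosis group. The captions do not state which correlation coefficient was used, so the sketch below assumes Pearson's r; the function name `pearson_r` and both weekly series are hypothetical and serve only to illustrate the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical weekly series, for illustration only (not study data).
weekly_covid_deaths = [0, 120, 450, 900, 600, 300]
weekly_excess_deaths = [10, 40, 130, 260, 150, 95]
r = pearson_r(weekly_covid_deaths, weekly_excess_deaths)
print(f"Pearson r = {r:.3f}")
```

A rank-based coefficient such as Spearman's would be a reasonable alternative if the weekly series are strongly skewed by the wave peaks.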
Appendix 1—figure 6
Weekly observed and estimated baseline mortality for each diagnosis group (Cancer (a), Diabetes (b), Alzheimer’s (c), Ischemic Heart Disease (d), Kidney Disease (e), Pancreatic Cancer (f), Lung Cancer (g), Colorectal Cancer (h), Breast Cancer (i), Hematological Cancer (j)) as either the underlying cause or anywhere on the death certificate (multiple cause) from 2017 to 2020 in New York.

Red dashed lines represent the timing of the peaks for the three COVID-19 waves in 2020. Baselines during the pandemic are projected based on the previous years of data.

Appendix 1—figure 7
Weekly observed and estimated baseline mortality for each diagnosis group (Cancer (a), Diabetes (b), Alzheimer’s (c), Ischemic Heart Disease (d), Kidney Disease (e), Pancreatic Cancer (f), Lung Cancer (g), Colorectal Cancer (h), Breast Cancer (i), Hematological Cancer (j)) as either the underlying cause or anywhere on the death certificate (multiple cause) from 2017 to 2020 in Texas.

Red dashed lines represent the timing of the peaks for the three COVID-19 waves in 2020. Baselines during the pandemic are projected based on the previous years of data.

Appendix 1—figure 8
Weekly observed and estimated baseline mortality for each diagnosis group (Cancer (a), Diabetes (b), Alzheimer’s (c), Ischemic Heart Disease (d), Kidney Disease (e), Pancreatic Cancer (f), Lung Cancer (g), Colorectal Cancer (h), Breast Cancer (i), Hematological Cancer (j)) as either the underlying cause or anywhere on the death certificate (multiple cause) from 2017 to 2020 in California.

Red dashed lines represent the timing of the peaks for the three COVID-19 waves in 2020. Baselines during the pandemic are projected based on the previous years of data.

Appendix 1—figure 9
Comparison of ICD-10 letter categories between 2020 and 2019 for the underlying cause (UC) of death when cancer or diabetes is included on the death certificate but is not listed as the UC of death.

For both cancer and diabetes, I codes (diseases of the circulatory system) make up the majority of underlying causes of death. The most notable difference between 2019 and 2020 is the increase in U codes, which include COVID-19 (U071). In total, COVID-19 was the UC for 13,434 of the cancer multiple cause (MC) deaths. COVID-19 was included in <3% of all cancer deaths and 17% of diabetes deaths. In both cases it was listed as the UC on the majority of death certificates where it was included (81% and 97% for cancer and diabetes, respectively).
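
The figures in this caption can be cross-checked against each other with a little arithmetic. The sketch below assumes that the "<3% of all cancer deaths" statement uses the 2020 multiple cause cancer count from the descriptive table (586,503) as its denominator; that choice of denominator is an assumption, and the 81% share is itself rounded, so the implied count is approximate. Variable names are illustrative.

```python
# Values quoted in the caption and the 2019/2020 descriptive table.
covid_uc_among_cancer_mc = 13_434  # cancer MC deaths with COVID-19 as the UC
share_listed_as_uc = 0.81          # COVID-19 was the UC on 81% of certificates listing it
cancer_mc_deaths_2020 = 586_503    # 2020 multiple cause cancer deaths (assumed denominator)

# Implied number of cancer MC deaths with COVID-19 anywhere on the certificate.
covid_any_mention = covid_uc_among_cancer_mc / share_listed_as_uc

# Share of all cancer MC deaths that mention COVID-19.
pct_with_covid = 100 * covid_any_mention / cancer_mc_deaths_2020

print(f"implied certificates mentioning COVID-19: {covid_any_mention:.0f}")
print(f"share of cancer MC deaths: {pct_with_covid:.1f}%")
```

Under these assumptions the implied share falls just below 3%, which agrees with the caption.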

Appendix 1—figure 10
Post-2020 trends in cancer, diabetes, and Alzheimer’s mortality.

Aggregated weekly data were downloaded from CDC Wonder. Trends in the cancer mortality rate appear stable in the national data and in Texas and California, but decreasing in New York. The diabetes mortality rate is higher post-2020 than in earlier years across all states. Alzheimer’s mortality appears stable or slowly decreasing.

Appendix 1—figure 11
For each condition three time series models with different time trends were considered (see Materials and methods).

The final model for each condition and location is indicated in blue. The final model was fit to 2014–2018 data only and used to predict the 2019 data. A coverage proportion (shown in white) was calculated as the proportion of observed 2019 data that fell within the projection intervals of the model. For all causes of death and states (except multiple cause [MC] kidney disease in California) the coverage proportion was 1, indicating that all data points fell within the prediction intervals.
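
The coverage proportion described above is the fraction of held-out observations that fall inside the model's prediction intervals. A minimal sketch of that calculation follows; the function name `coverage_proportion` and the example values are hypothetical, and the time series model that would produce the interval bounds is not shown.

```python
def coverage_proportion(observed, lower, upper):
    """Fraction of observed points lying within [lower, upper] (inclusive)."""
    if not (len(observed) == len(lower) == len(upper)):
        raise ValueError("all inputs must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")
    inside = sum(lo <= y <= hi for y, lo, hi in zip(observed, lower, upper))
    return inside / len(observed)


# Hypothetical held-out weekly deaths and prediction interval bounds.
observed_2019 = [980, 1010, 1045, 1100]
pi_lower = [950, 960, 970, 980]
pi_upper = [1050, 1060, 1070, 1080]
coverage = coverage_proportion(observed_2019, pi_lower, pi_upper)
print(f"coverage proportion = {coverage:.2f}")
```

A value of 1 means every held-out point fell within its interval, which is the result the caption reports for nearly all conditions and locations.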

Data availability

Individual-level mortality data with exact date of death were obtained from the National Center for Health Statistics (NCHS). Individual-level data with exact date of death are not publicly available due to privacy concerns, but descriptive characteristics have been summarized in Table 1 and Appendix 1—table 1. Researchers wishing to access these data must submit an application to NCHS following the instructions provided here: https://www.cdc.gov/nchs/nvss/nvss-restricted-data.htm. Individual-level data without the exact date of death are publicly available and can be downloaded from here: https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm. All analyses shown in this paper can be replicated based on the aggregated data and code posted at the following public GitHub repository: https://github.com/chelsea-hansen/Disentangling-the-relationship-between-cancer-mortality-and-COVID-19 (copy archived at Hansen, 2024). Additional weekly, aggregated mortality data for trends in cancer, diabetes, and Alzheimer’s mortality post-2020 are publicly available through CDC Wonder and have been included in the GitHub repository. Data used for the demographic model were gathered from publicly available sources. These data, along with the code for the model, have also been posted to the GitHub repository. Weekly, state-level data on recorded COVID-19 cases and deaths are publicly available. Data were downloaded from here: https://data.cdc.gov/Case-Surveillance/Weekly-United-States-COVID-19-Cases-and-Deaths-by-/pwn4-m3yp and have also been posted as a .csv file to the GitHub repository.

Article and author information

Author details

  1. Chelsea L Hansen

    1. Division of International Epidemiology and Population Studies, Fogarty International Center, National Institutes of Health, Bethesda, United States
    2. PandemiX Center, Dept of Science & Environment, Roskilde University, Roskilde, Denmark
    3. Brotman Baty Institute, University of Washington, Seattle, United States
    Contribution
    Data curation, Formal analysis, Visualization, Methodology, Writing – original draft, Writing – review and editing
    For correspondence
    chelsea.hansen@nih.gov
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4526-6772
  2. Cécile Viboud

    Division of International Epidemiology and Population Studies, Fogarty International Center, National Institutes of Health, Bethesda, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Supervision, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-3243-4711
  3. Lone Simonsen

    1. Division of International Epidemiology and Population Studies, Fogarty International Center, National Institutes of Health, Bethesda, United States
    2. PandemiX Center, Dept of Science & Environment, Roskilde University, Roskilde, Denmark
    Contribution
    Conceptualization, Data curation, Formal analysis, Supervision, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-1535-8526

Funding

Carlsberg Foundation (CF20-0046)

  • Lone Simonsen

Danish National Research Foundation (DNRF170)

  • Chelsea L Hansen
  • Lone Simonsen

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This paper is dedicated to our colleague Robert J Taylor, who succumbed to cancer in 2022 and who wanted to know if a cancer diagnosis was a COVID-19 mortality risk factor. LS acknowledges funding from the Carlsberg Foundation (grant number CF20-0046). LS and CLH acknowledge funding from the Danish National Research Foundation (grant number DNRF170) for the PandemiX Center of Excellence. CLH has received contract-based hourly consulting fees from Sanofi outside of the submitted work.

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.93758. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Cite this article

Chelsea L Hansen, Cécile Viboud, Lone Simonsen (2024) Disentangling the relationship between cancer mortality and COVID-19 in the US. eLife 13:RP93758. https://doi.org/10.7554/eLife.93758.3
