Introduction

The dominant risk factors for COVID-19 mortality have consistently been shown to be advanced age, male gender and certain chronic diseases such as diabetes, obesity and heart disease (Chavez-MacGregor et al., 2022; Rüthrich et al., 2021; Williamson et al., 2020). Cancer has also been identified as a high-risk condition based on case-control and cohort studies, although these studies have provided conflicting results. In a large cohort study of ∼500,000 COVID-19 inpatients, only cancer patients under recent treatment were at increased risk of COVID-19 related deaths (OR=1.7) relative to non-cancer patients (Chavez-MacGregor et al., 2022). Conversely, a smaller European study of 3,000 COVID-19 inpatients found that cancer was not a risk factor (Rüthrich et al., 2021), as did an international, multicenter study of 4,000 confirmed COVID-19 inpatients (Raad et al., 2023). More recently a meta-analysis of 35 studies from Europe, North America, and Asia found a 2-fold increased risk of COVID-19 mortality among cancer patients (Di Felice et al., 2022). Similarly, a large analysis from the UK found that the risk of COVID-19 mortality for cancer patients had declined over the course of the pandemic but remained 2.5 times higher than for non-cancer patients into 2022 (Starkey et al., 2023). Taken together, such observational studies provide a mixed picture of cancer as a COVID-19 mortality risk factor, with several studies reporting that controlling for other important factors such as age is a challenge. Further, cancer is often considered as a single disease category despite the diversity of conditions and patients represented.

Further evidence for the relationship between cancer and COVID-19 comes from population-level analysis of vital statistics. A recent US study showed no elevation in cancer deaths concomitant with COVID-19 waves, in stark contrast to mortality from other chronic diseases (W.-E. Lee et al., 2023). Several other countries, including Sweden, Italy, Latvia, Brazil, England and Wales also observed stable or decreasing cancer mortality during the first year of the pandemic (Alicandro et al., 2023; Fernandes et al., 2021; Gobiņa et al., 2022; Grande et al., 2022; Kontopantelis et al., 2022; Lundberg et al., 2023). Further, a study of 240,000 cancer patients in Belgium found a 33% rise in mortality in April 2020, but concluded that this was no different from the excess mortality observed in the general population (Silversmit et al., 2021). These findings raise the question of the true relationship between cancer and COVID-19.

The relationship between these two diseases could operate via multiple biological mechanisms, where immunosuppression in cancer patients could increase susceptibility to SARS-CoV-2 infection and/or risk of severe clinical outcome upon infection. Conversely, immunosuppression could be seen as a protective factor in the face of a severe respiratory infection that overstimulates the immune system, an idea known as the immune incompetence rescue hypothesis (Reichert et al., 2004). This hypothesis was put forward to explain the lack of elevation in underlying cancer mortality during the 1968 influenza pandemic and severe influenza epidemics, a departure from patterns seen for other high-risk conditions such as heart disease and diabetes (Reichert et al., 2004). A further mechanism that could affect the observed relationship between cancer deaths and COVID-19 is changing guidelines for establishing the primary cause of death. Coding guidelines evolved throughout the pandemic as testing for SARS-CoV-2 infection became more widespread, which presumably affected vital statistics studies.

To further elucidate the relationship between cancer mortality and COVID-19 on a population level, we analyzed US vital statistics in detail to understand the potential role of coding changes during the pandemic and explored putative differences in mortality patterns between different types of cancer. The US provides a particularly useful case study as the timing of COVID-19 waves varied considerably between states, so that elevations in cancer deaths, should they exist, should also be heterogeneous. For context, we also assessed mortality patterns for other chronic conditions such as diabetes, ischemic heart disease (IHD), kidney disease, and Alzheimer’s, for which the association with COVID-19 is less debated.

Results

Establishing patterns and timing of COVID-19 related deaths

We obtained individual ICD-10 coded death certificate data from the US for the period January 1, 2014, to December 31, 2020. We compiled time series by week, state, and cause of death, for underlying cause (UC) and for multiple-cause (MC, any mention on death certificate) mortality. We considered 10 causes of death, including diabetes, Alzheimer’s disease, ischemic heart disease (IHD), kidney disease, and 6 types of cancer (all-cause cancer, colorectal, breast, pancreatic, lung, and hematological; see Table 1 and Appendix 1 - Table 1 for a list of disease codes). We chose these types of cancer to illustrate conditions for which the 5-year survival rate is low (13% and 25%, respectively, for pancreatic and lung cancers) and high (65% and 91%, respectively, for colorectal and breast cancers) (National Cancer Institute, n.d.).

Each diagnosis group and its corresponding ICD-10 codes, number of underlying deaths, mean age in years at time of death, the percentage of deaths occurring at home, and the percentage of deaths occurring in nursing homes for 2019 and 2020.

Hematological cancer (67% 5-year survival rate) was included because it has been singled out as a risk factor in several previous studies (Chavez-MacGregor et al., 2022; X. Han et al., 2022; Rüthrich et al., 2021; Williamson et al., 2020). To compare mortality patterns with the timing of COVID-19 pandemic waves, we accessed national and state counts of reported COVID-19 cases from the Centers for Disease Control and Prevention (CDC) (Centers for Disease Control and Prevention, 2022).

In national data, time series of COVID-19-coded death certificates (both UC and MC) tracked with the temporal patterns of laboratory-confirmed COVID-19 cases (Figure 1), revealing three distinct COVID-19 waves: a spring wave peaking on April 12, 2020, a smaller summer wave peaking on July 26, 2020, and a large winter wave that had not yet peaked by the end of the study in December 2020. This correspondence between COVID-19 case and death activity represents a “signature” mortality pattern of COVID-19.

Weekly counts of death certificates listing COVID-19 as either the underlying or a multiple cause. When included on a death certificate, COVID-19 was most often listed as the underlying cause of death rather than a contributing cause. National-level data reveal three distinct waves: Wave 1 (spring, March 1 - June 27, 2020), Wave 2 (summer, June 28 - October 3, 2020), and Wave 3 (winter, October 4 - December 6, 2020, incomplete). Vertical dashed lines represent the peak of each wave; dotted lines represent the number of reported cases (y-axis on the right). New York experienced its first large COVID-19 wave in Wave 1, while Texas had its first large wave in Wave 2 and California did not experience a large wave until Wave 3, which had not yet peaked at the end of 2020.

In state-level data, the timing, intensity, and number of COVID-19 waves during 2020 varied between states. To focus on periods with substantial COVID-19 activity and explore the association with cancer, we identified three large US states with unique, well-defined waves (Figure 1). New York (NY) state experienced a large, early wave in March-May 2020, based on recorded COVID-19 cases and deaths and high seroprevalence of SARS-CoV-2 antibodies in New York City in this period (over 20% (Stadlbauer et al., 2021)). Meanwhile, California (CA) experienced a large COVID-19 wave at the end of the year and had little activity during the spring and summer. Finally, Texas (TX) had two large waves: one during late summer, followed by one in winter 2020.

National patterns in excess mortality from cancer

Similar to other influenza and COVID-19 population-level mortality studies (W.-E. Lee et al., 2023), we established a weekly baseline model for expected mortality in the absence of pandemic activity by modeling time trends and seasonality in pre-pandemic data and letting the model run forward during the pandemic (see Methods). Each cause of death (UC and MC) and geography was modeled separately. We then computed excess mortality as the difference between observed deaths and the model-predicted baseline. We summed weekly estimates to calculate excess mortality for the full pandemic period and during each of the 3 waves (see Methods). In addition to these absolute effects of the pandemic on mortality, we also calculated the relative effects by dividing excess mortality by baseline mortality (see Methods).
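The absolute and relative excess mortality calculations described above can be sketched as follows. This is a minimal illustration rather than the study's code: the function name and the weekly counts are hypothetical, and the baseline is taken as given instead of being produced by a fitted model.

```python
def excess_mortality(observed, baseline):
    """Sum weekly excess deaths and express them as a percentage of baseline.

    observed, baseline: equal-length sequences of weekly deaths
    (observed counts and model-predicted expected counts).
    """
    if len(observed) != len(baseline):
        raise ValueError("observed and baseline must cover the same weeks")
    # Absolute effect: observed minus expected, summed over the period.
    excess = sum(o - b for o, b in zip(observed, baseline))
    # Relative effect: excess divided by expected deaths over the same period.
    percent_over_baseline = 100.0 * excess / sum(baseline)
    return excess, percent_over_baseline


# Hypothetical weekly counts, for illustration only (not study data).
observed = [1050, 1100, 1200, 1080]
baseline = [1000.0, 1000.0, 1000.0, 1000.0]
excess, pct = excess_mortality(observed, baseline)
print(f"excess deaths: {excess:.0f} ({pct:.2f}% over baseline)")
```

Restricting both sequences to the weeks of a single wave gives the wave-specific estimates in the same way.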

Nationally, we found a drop in UC cancer deaths during spring 2020 (Figure 2, panel a; Table 2), although the drop was not statistically significant. A similar non-significant decline was also seen for specific cancer types (Figure 2, panels b-c; Appendix 1 - Figure 1, panels a;f-j). We also saw that pre-pandemic mortality trends for each cancer type continued unabated during the first pandemic year. We reasoned that the drop in UC cancer deaths seen at the start of the pandemic could be evidence of a modest harvesting effect or alternatively could be due to changes in coding practices. If a death occurred in a cancer patient with COVID-19, the death could be coded with COVID-19 as the underlying cause of death and could explain the observed drop. We turned to MC mortality to resolve this question.

National-level weekly observed and estimated baseline mortality for each diagnosis group as both the underlying cause or anywhere on the death certificate (multiple cause) from 2014 to 2020. Baselines during the pandemic are projected based on the previous years of data.

The estimated number of excess deaths and the percentage over baseline for each diagnosis group when listed as both the underlying cause or anywhere on the death certificate (multiple cause). Estimates for the national-level data are provided for the full pandemic period and for each state based on when the first large wave was experienced.

Time series of MC cancer mortality (any mention of any cancer code) showed a significant increase in all three waves (Figure 2, panel a; Appendix 1 - Table 2). A similar pattern was seen in MC time series for colorectal (Figure 2, panel h), breast (Appendix 1 - Figure 1, panel i), and hematological cancer (Appendix 1 - Figure 1, panel j). However, the total excess mortality was modest with 12,000 excess cancer deaths in 2020, representing a statistically significant 2% elevation over baseline (Table 2). The largest relative increase in MC mortality was observed in hematological cancer at 5% (statistically significant, 3,100 excess deaths). No excess in MC mortality was seen for the two deadliest cancers, pancreatic cancer (Figure 2, panel f) and lung cancer (Appendix 1 - Figure 1, panel g).

National patterns in deaths due to other chronic conditions

We considered diabetes and Alzheimer’s as “positive controls” as they are also considered COVID-19 risk factors and can illustrate associations between excess mortality from chronic conditions and COVID-19 on a population level. Diabetes provides a particularly useful comparator for cancer as the mean age at death is similar (∼72 years, Table 1) and because few individuals live in a nursing home (Appendix 1 - Supplemental Methods). Mortality time series from UC and MC diabetes and Alzheimer’s were highly correlated with COVID-19 activity, with statistically significant mortality elevation synchronous with pandemic wave activity (Figure 2 b-c; Appendix 1 - Figures 2-5). For diabetes, we measured an excess of 11,400 and 85,700 deaths (UC and MC, respectively), corresponding to an elevation of 17% and 39% over baseline level mortality (Table 2). For Alzheimer’s, we estimated 18,500 and 32,200 excess deaths, corresponding to 21% and 31% elevation over baseline, respectively. Pandemic-related excess mortality was also seen for IHD and kidney disease (see supplement for estimates, Appendix 1 - Table 2).

State-level patterns in excess mortality

Similar to cancer patterns in national level data, none of the studied states had notable increases in UC cancer mortality, while there was a modest, non-significant increase in MC cancer mortality (Figures 3-5; Appendix 1 - Figures 6-8). The largest mortality increase was seen in NY during the spring wave, with an 8% rise in MC cancer mortality above the model baseline (Table 2; Appendix 1 - Table 3). The magnitude of the increase seen during the spring wave varied by cancer type, with minimal increases seen in pancreatic and lung cancers (≤2%) and higher increases in colorectal, hematological, and breast cancers (8, 13, and 15%, respectively). For comparison, there was a statistically significant rise in Alzheimer’s and diabetes deaths during this wave, by 55% and 126%, respectively.

The same as figure 1, but for New York. New York experienced its first large wave of COVID-19 in spring 2020 (Wave 1).

The same as figure 1, but for Texas. Texas experienced its first large wave of COVID-19 in the summer of 2020 (Wave 2).

The same as figure 1, but for California. California did not experience a large wave of COVID-19 until the winter of 2020-2021 (Wave 3), only the first half of which is captured here.

Projections of COVID-19-related excess mortality patterns for different cancers and chronic conditions in the US, under different hypotheses for the association between the condition and COVID-19.

Projections are provided for the null hypothesis of no biological interaction between the condition and COVID-19; these projections are solely driven by the size and age distribution of the population living with each condition (where age determines the infection-fatality ratio from COVID-19), and the baseline risk of death from the condition over a similar time period (March to December 2019, 10 months). Additional projections are provided under alternative hypotheses, where each condition is associated with a relative risk (RR) of 2 for COVID-19 related death (infection-fatality ratio multiplied by 2).

In CA and TX, mortality fluctuations were less pronounced than in NY, coinciding with less intense COVID-19 waves, and this was seen across all conditions. MC excess mortality estimates remained within ±6% of baseline levels for cancers, irrespective of the type of cancer and pandemic wave, except for a 12% elevation in hematological cancer (MC) during the summer wave in Texas. None of these elevations were statistically significant. In comparison, there was significant excess mortality elevation for both Alzheimer’s and diabetes deaths (range, 25-49% in the CA winter wave, and 65-76% in the TX summer wave, Appendix 1 - Tables 4-5).

Mortality projections under the null hypothesis that cancer in and of itself is not a risk factor for COVID-19 mortality

Two main factors could drive cancer mortality patterns during COVID-19, namely the age of the population living with cancer (since age is such a pronounced risk factor for COVID-19), and the life expectancy under cancer diagnosis. These factors would operate irrespective of the true biological relationship between SARS-CoV-2 infection, severity, and cancer.

To test the impact of these factors on observed excess mortality patterns and assess whether these factors alone could explain differences in excess mortality between chronic conditions, we designed a simple demographic model of COVID-19 mortality for individuals with chronic conditions. The model projected excess mortality during the pandemic under the null hypothesis that the chronic condition was not in and of itself a risk factor for COVID-19 mortality, with only the demography of the population living with the disease (namely, age, size and baseline risk of death) affecting excess mortality. In the demographic model, we first estimated the number of expected COVID-19 infections among persons with a certain condition, by multiplying the estimated number of US individuals living with the condition by the reported SARS-CoV-2 seroprevalence at the end of our study period (December 2020). We focused on seroprevalence among individuals ≥65 years, the most relevant age group for the conditions we considered. We then multiplied the estimated number of SARS-CoV-2 infections by an age-adjusted infection-fatality ratio (IFR) for SARS-CoV-2 (COVID-19 Forecasting Team, 2022). This gave an estimate of COVID-19-related deaths, or excess deaths, for a given condition. We divided our excess death estimate by the total deaths for that condition in 2019 to estimate a percent elevation over baseline (see Methods). We repeated this analysis for each cancer type, diabetes, Alzheimer’s, IHD, and kidney disease. In addition to the null hypothesis, we also projected an alternative hypothesis of a biological association, assuming that a given chronic condition would raise the risk of COVID-19 mortality (via the infection-fatality ratio) by a factor of 2. We compared these modeled expectations for the null and alternative hypotheses with the observed excess mortality in 2020, focusing on MC as the outcome (Table 2).
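The steps of the demographic model can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the study's implementation: the function names, age bands, and all input values are hypothetical, and the age adjustment is reduced to a weighted average of IFRs over the age distribution of the population living with the condition.

```python
def age_adjusted_ifr(age_shares, ifr_by_age):
    """Weighted average IFR over the age distribution of the population at risk."""
    total = sum(age_shares.values())
    return sum(share * ifr_by_age[age] for age, share in age_shares.items()) / total


def projected_elevation(pop_at_risk, seroprevalence, ifr, baseline_deaths, rr=1.0):
    """Project COVID-19-related deaths and percent elevation over baseline.

    rr = 1.0 corresponds to the null hypothesis (condition is not itself a
    risk factor); rr = 2.0 doubles the infection-fatality ratio.
    """
    infections = pop_at_risk * seroprevalence   # expected infections
    covid_deaths = infections * ifr * rr        # expected COVID-19-related deaths
    return covid_deaths, 100.0 * covid_deaths / baseline_deaths


# Hypothetical inputs, for illustration only (not study data).
shares = {"65-74": 0.5, "75-84": 0.3, "85+": 0.2}
ifrs = {"65-74": 0.01, "75-84": 0.03, "85+": 0.08}
ifr = age_adjusted_ifr(shares, ifrs)

null_deaths, null_pct = projected_elevation(1_000_000, 0.10, 0.02, 50_000)
alt_deaths, alt_pct = projected_elevation(1_000_000, 0.10, 0.02, 50_000, rr=2.0)
print(f"age-adjusted IFR: {ifr:.3f}")
print(f"null: {null_pct:.1f}% over baseline; RR=2: {alt_pct:.1f}% over baseline")
```

Because the projected deaths are divided by baseline deaths, a condition with high baseline mortality yields a small percent elevation even when the number of projected COVID-19-related deaths is the same.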

Under the null hypothesis we projected a 7% elevation in all cancer deaths over the 2019 baseline (Table 3). For hematological cancers and particularly deadly cancers such as pancreatic and lung, we projected only a 1-2% elevation in mortality, in part driven by the high competing risk of death from these cancers (short life expectancy) and the small size of the population-at-risk. For colorectal and breast cancers, we projected a 6% and 14% elevation in mortality, respectively, in part driven by the lower risk of death from these cancers (longer life expectancy). Under the alternative hypothesis that cancer doubled the COVID-19 infection-fatality ratio (IFR), we projected a 13% elevation in total cancer mortality, 2% in pancreatic cancer, and 28% in breast cancer. In empirical national MC mortality data, we observed a 0-3% elevation over baseline for all the non-hematological cancers and 5% for hematological cancers, more consistent with the null hypothesis. We note, however, that for the large spring wave in NY state the rise in cancers was closer to that projected under the assumption of a relative risk of 2.

We repeated this analysis for diabetes, Alzheimer’s, IHD, and kidney disease mortality (Table 3; Appendix 1 - Table 6). For diabetes we projected a 28% elevation over baseline based on the age distribution and substantial size of the population-at-risk alone. In fact, we observed a 39% elevation over baseline in national US data. For Alzheimer’s we projected a 46% increase over baseline, while we found a 31% increase in national US mortality data. Similarly, for IHD and kidney disease, the magnitude of the excess mortality rise projected under the demographic model was higher than for cancer and consistent with observations (Appendix 1 - Table 6). These projections support the idea that demography alone (age, size, and baseline mortality of the population living with each of these conditions) can explain much of the difference in absolute and relative mortality elevations seen during the pandemic across conditions like cancer, diabetes, and Alzheimer’s.

Discussion

Cancer is generally thought of as a risk factor for severe COVID-19 outcomes, yet observational studies have produced conflicting evidence. With the recent availability of more detailed US vital statistics data, we used statistical time series approaches to generate excess mortality estimates for multiple cause of death data, different types of cancer, and several geographic locations. We accounted for potential changes in coding practices during the pandemic, for instance capturing a COVID-19 patient with cancer whose death may have been coded as a primary COVID-19 death and not a cancer death. Based on multiple cause of death data, we estimated 12,000 national COVID-19-related excess cancer deaths, which aligns well with reporting on death certificate data, where 13,400 deaths are ascribed to COVID-19 in cancer patients (Appendix 1 - Figure 9). Yet these deaths only represent a 2% elevation over the expected baseline cancer mortality. Percent mortality elevation was measurably higher for less deadly cancers (breast and colorectal) than cancers with a poor 5-year survival (lung and pancreatic). Consistent with other studies (Chavez-MacGregor et al., 2022; X. Han et al., 2022; Rüthrich et al., 2021; Williamson et al., 2020), we found that the largest mortality increase for specific cancer types was seen in hematological cancers with a 5% elevation over baseline.

In contrast to cancer, we observed substantial COVID-19-related excess mortality for diabetes and Alzheimer’s, temporally consistent with the three-wave “signature” pattern observed in reported COVID-19 cases and deaths. To investigate whether demographic differences in underlying patient populations (age distribution, population size, and baseline risk of death due to underlying condition) could explain differences in excess mortality during the pandemic, we ran a simple demographic model for each condition – first assuming the condition in and of itself was not a risk factor for COVID-19-related mortality (null hypothesis). The results of these projections were consistent with observed excess mortality patterns; specifically, we did not expect to see large increases in cancer deaths compared to these other chronic conditions.

These projections also illustrate the importance of competing risks, where the risk of cancer death predominates over the risk of COVID-19 death. This is exacerbated for cancers with the lowest survival; for instance, for pancreatic cancer, under the null hypothesis we would expect a <1% risk of mortality from COVID-19 in 2020 (assuming a 9% attack rate and 2.6% IFR, Appendix 1 - Table 7). In contrast, the 2019 baseline risk of death for pancreatic cancer itself is 43.5% (ratio of deaths to population-at-risk = 1:2.3, Table 3). Even if pancreatic cancer had in fact doubled the risk of dying of COVID-19 (IFR = 5.2%), we would not expect to see more than a 1% excess mortality elevation during the pandemic (Table 3), due to the high baseline level mortality associated with this disease. On the other hand, conditions with a lower baseline level mortality, such as diabetes (<1% baseline risk of death), are more sensitive to COVID-19 driven elevations in mortality.
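The pancreatic cancer example can be worked through with the figures quoted above (9% attack rate, 2.6% IFR, 43.5% baseline risk of death). The calculation below is a simplified per-person illustration that assumes a single average IFR for the whole population at risk; it is not the full age-specific projection reported in Table 3.

```latex
\begin{align*}
P(\text{COVID-19 death}) &= 0.09 \times 0.026 \approx 0.0023 \quad (0.23\%) \\[4pt]
\text{Elevation over baseline (null)} &= \frac{0.09 \times 0.026}{0.435} \approx 0.005 \quad (\approx 0.5\%) \\[4pt]
\text{Elevation over baseline (RR = 2)} &= \frac{0.09 \times 0.052}{0.435} \approx 0.011 \quad (\approx 1\%)
\end{align*}
```

The large denominator is what keeps the relative elevation small: the same per-person COVID-19 risk set against a baseline risk of death below 1% would produce a far larger percent elevation.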

Our study rules out the immune incompetence rescue hypothesis that was raised in a 2004 paper on excess mortality patterns during influenza seasons (Reichert et al., 2004). Similarly, the possibility that infectious disease mortality risk is modulated by immune competence has been put forward to explain the extreme mortality in young healthy adults in the 1918 pandemic (Short et al., 2018). In the 2004 study, cancer deaths did not increase during the 1968 influenza pandemic as they did for other risk conditions, leading the authors to propose that immunosuppressive cancer treatment could mitigate an aberrant immune response to pandemic influenza infection. However, observational studies have consistently found the opposite to be the case for COVID-19 infection in patients with hematological cancers. These patients have twice the risk of dying compared to patients without cancer, likely due to the immunosuppression associated with their malignancy and treatment (X. Han et al., 2022; Starkey et al., 2023; Williamson et al., 2020). Under the immune incompetence rescue hypothesis, one would have expected the opposite: that hematological cancers would have the lowest excess mortality of all cancers. Our analysis of empirical vital statistics reveals instead that hematological cancers were the most impacted by the pandemic, relative to other types of cancer, with a percent elevation over baseline most pronounced in states that were hit intensely, like New York.

Nationally, the observed excess mortality for non-hematological cancers was lower than that expected under our demographic model, even under the null hypothesis of no biological association between non-hematologic cancers and COVID-19. The null hypothesis may still be valid as our analysis ignores any behavioral effects associated with the pandemic. It is conceivable that cancer patients may have shielded themselves from COVID-19 more than the average person or even other persons with chronic diseases in 2020. Our projections assume an average risk of infection for a typical individual over 65 years as there is no serologic data for specific clinical population subgroups (of any age). If shielding was high among cancer patients, our projections of cancer excess mortality during the pandemic would be inflated, potentially explaining the disconnect with observations. Retrospective serologic analysis of banked sera from the first year of the pandemic, broken down by underlying comorbidities, may shed light on whether infection risk may have varied by chronic condition.

State-level mortality patterns can potentially provide indirect insights on the question of shielding from exposure to SARS-CoV-2. Because NY experienced the earliest and most intense COVID-19 wave in the US, with 25% of the population infected in spring 2020 (Centers for Disease Control and Prevention, 2023), and because social distancing did not come into effect until March 2020, shielding would have had a more limited impact there than in other states. Thus, a biological relationship between cancer and COVID-19 would have been most dramatic in NY in spring 2020. Indeed, cancer excess mortality was exacerbated in NY, including an 8-15% increase in colorectal and breast cancer mortality. Yet these increases are still aligned with the projections from our demographic model under the null hypothesis. The absence of excess mortality in pancreatic and lung cancer in NY (0% and 1% over baseline) is, as discussed above, still consistent with what would be expected under a high competing risk situation.

Most vital statistics studies focused on the COVID-19 pandemic have relied on underlying cause-specific deaths, which are prone to changes in coding practices. Our initial hypothesis going into this work was that coding changes associated with a better recognition of the impact of SARS-CoV-2 led to an underestimation of excess mortality from cancer, affecting our perception of the relationship between cancer and COVID-19. We certainly found an effect of coding changes, where for instance a drop in excess mortality in underlying cancer deaths turned into an increase in any-listed cancer deaths, particularly in the first COVID-19 pandemic wave. The impact of coding changes was also seen in mortality from other chronic conditions but was particularly important for cancer. Yet both the absolute and relative excess mortality elevation remained modest for cancer, even after adjustment for coding changes, leading us to consider additional mechanisms such as the competing risk hypothesis.

Our study is subject to limitations. Given uncertainty in SARS-CoV-2 attack-rates and the age distribution and size of the population-at-risk for all studied conditions, our demographic model projections are not an exact tool to titrate excess mortality nor the relative risk associated with each condition. Our model merely serves as an illustration of the role of demography and competing risks. Further, we did not study the potential long-term consequences of the pandemic on cancer care, which includes avoidance of the health care system for diagnosis or treatment. We did not see any delayed pandemic effect on mortality from pancreatic cancer, which may have manifested in 2020 given the very low survival rate of this cancer (Lemanska et al., 2023), but we cannot rule out longer-term effects on breast or colorectal cancers that would not be seen until 2021 or later (Doan et al., 2023; Han et al., 2023; Haribhai et al., 2023; R. Lee et al., 2023; Nascimento de Lima et al., 2023; Nickson et al., 2023; Nonboe et al., 2023; Tope et al., 2023). Additional years of data will be important to evaluate such effects. Additional years of data will also be important for assessing the impact of vaccination on the relationship between cancer and COVID-19; there is evidence that vaccines may be less immunogenic in patients with cancer compared to those without (Seneviratne et al., 2022). Another limitation of our study is the reliance on mortality as an outcome, while it may be important to consider the risk of COVID-19-related hospitalization and morbidity, and Long COVID in cancer patients. A small US study reported that 60% of cancer patients suffered Long COVID symptoms (Dagher et al., 2023). Future analyses using hospitalization data and electronic medical records may provide additional insights on how different cancer stages or other comorbidities may contribute to increased risk of severe COVID-19 outcomes. Lastly, a few methodological limitations are worth raising. 
Though it was important to assess excess mortality in state-level data because of asynchrony in pandemic waves, confidence intervals in state-level estimates were large, particularly for specific types of cancers, affecting significance levels. In addition, our study is a time-trend analysis and, as with cohort and case-control studies, correlation does not necessarily imply causation. However, the intensity and brevity of COVID-19 pandemic waves in space and time lend support to our analyses.

Conclusion

Our detailed excess mortality study considered six cancer types and found at most a modest elevation in cancer mortality during the COVID-19 pandemic in the US. Our results demonstrate the importance of considering multiple-causes-of-death records to accurately reflect changes in coding practices associated with the emergence of a new pathogen. In contrast to earlier studies, we propose that the lack of excess cancer mortality during the COVID-19 pandemic reflects the competing mortality risk from cancer itself (especially for pancreatic and lung cancers) rather than protection conferred by immunosuppression. We note the more pronounced elevation in mortality from hematological cancers during the pandemic, compared to other cancers, which aligns with a particular group of cancer patients singled out in several cohort studies. Future research on the relationship between COVID-19 and cancer should concentrate on different outcomes, such as excess hospitalizations, Long COVID, changes in screening practices during COVID-19, and longer-term patterns in cancer mortality.

Materials and Methods

Data sources

US National vital statistics

We obtained individual ICD-10 coded death certificate data with exact date of death from the United States for the period January 1, 2014, to December 31, 2020. Each death certificate has one underlying cause (UC) of death, defined as the disease or injury that initiated the train of events leading directly to death, and up to twenty causes of death in total, referred to here as multiple cause mortality (MC). We considered 10 conditions, including diabetes, Alzheimer’s disease, ischemic heart disease (IHD), kidney disease, and 6 types of cancer (all cancer, colorectal, breast, pancreatic, lung, and hematological; see Table 1 for a list of disease codes). We chose these types of cancer to illustrate conditions for which the 5-year survival rate is low (13% and 25%, respectively, for pancreatic and lung cancers) and high (65% and 91%, respectively, for colorectal and breast cancers) (National Cancer Institute, n.d.). Hematological cancer (67% 5-year survival) was included because it was singled out as a risk factor by previous studies. We compiled time series by week, state, and cause of death, separately for underlying and multiple cause mortality.
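The distinction between underlying-cause and multiple-cause counts can be illustrated with a short tally. This is a sketch under assumptions: the record layout (`week`, `state`, `underlying`, `all_causes`), the undotted code format, and the two example conditions are hypothetical, the ICD-10 prefixes shown are illustrative rather than the study's Table 1 definitions, and the list of all causes is assumed to include the underlying cause.

```python
from collections import Counter

# Illustrative ICD-10 prefixes only; the study's definitions are in its Table 1.
CONDITION_PREFIXES = {
    "pancreatic_cancer": ("C25",),
    "diabetes": ("E10", "E11", "E12", "E13", "E14"),
}


def tally(certificates):
    """Count deaths by (week, state, condition) for UC and MC mortality."""
    uc, mc = Counter(), Counter()
    for cert in certificates:
        for condition, prefixes in CONDITION_PREFIXES.items():
            key = (cert["week"], cert["state"], condition)
            # UC: the condition is the single underlying cause of death.
            if cert["underlying"].startswith(prefixes):
                uc[key] += 1
            # MC: the condition is mentioned anywhere on the certificate.
            if any(code.startswith(prefixes) for code in cert["all_causes"]):
                mc[key] += 1
    return uc, mc


# Hypothetical certificates, for illustration only.
certificates = [
    {"week": "2020-W15", "state": "NY", "underlying": "U071",
     "all_causes": ["U071", "C259"]},
    {"week": "2020-W15", "state": "NY", "underlying": "C259",
     "all_causes": ["C259", "E119"]},
]
uc, mc = tally(certificates)
print(uc[("2020-W15", "NY", "pancreatic_cancer")],
      mc[("2020-W15", "NY", "pancreatic_cancer")])
```

The first record shows why the two series can diverge: a death coded to COVID-19 as the underlying cause still contributes to the multiple-cause cancer count.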

Other data sources

To compare vital statistics patterns with COVID-19 surveillance data, we accessed national and state counts of laboratory-confirmed COVID-19 cases in 2020, from the CDC (Centers for Disease Control and Prevention, 2022).

To clarify the expected contribution of COVID-19 to excess mortality, we compiled data on the proportion of the population with serologic evidence of SARS-CoV-2 infection by the end of 2020 from the CDC dashboard (Centers for Disease Control and Prevention, 2023). We further compiled data on estimated age-specific infection-fatality ratios from COVID-19, provided by single year of age (COVID-19 Forecasting Team, 2022).

Statistical approach

Weekly excess mortality models

Similar to other influenza- and COVID-19 excess mortality studies (W.-E. Lee et al., 2023), we established a predicted baseline of expected mortality for each time series, and computed the excess mortality as the excess in observed deaths over this baseline. To establish baselines for each disease nationally and in each state, we applied negative binomial regression models to weekly mortality counts for each cause of death, smoothed with a 5-week moving average and rounded to the nearest integer. Models included harmonic terms for seasonality, time trends, and an offset for population size, following:

Weekly_mortality = t + t² + cos(2πt/52.17) + sin(2πt/52.17) + offset(log(population)), where t represents week.
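Written out with explicit coefficients, the model formula above can be read as follows. This is our reading of a standard negative binomial regression with a log link; the symbols Y_t (smoothed weekly deaths), N_t (population), β_0 to β_4 (regression coefficients) and θ (dispersion parameter) are notation introduced here and do not appear in the original formula.

```latex
\log \mathbb{E}[Y_t] = \beta_0 + \beta_1 t + \beta_2 t^2
  + \beta_3 \cos\!\left(\frac{2\pi t}{52.17}\right)
  + \beta_4 \sin\!\left(\frac{2\pi t}{52.17}\right)
  + \log(N_t),
\qquad Y_t \sim \mathrm{NegBin}(\mu_t, \theta), \quad \mu_t = \mathbb{E}[Y_t]
```

The offset term enters with a fixed coefficient of 1, so the model describes a mortality rate per capita rather than a raw count.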

We fitted national and state-level models for each mortality outcome from January 19, 2014, to March 1, 2020, and projected the baseline forward until December 6, 2020, the last complete week of smoothed mortality data.
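The preprocessing and covariates that feed the baseline model can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the moving average is assumed to be centered, and ties are rounded half up. Fitting the negative binomial regression itself would require a statistical library and is not shown.

```python
import math


def smooth_5wk(counts):
    """Centered 5-week moving average, rounded half up to an integer.

    Centering is an assumption. The first and last two weeks have no
    complete 5-week window and are returned as None.
    """
    out = [None] * len(counts)
    for i in range(2, len(counts) - 2):
        mean = sum(counts[i - 2:i + 3]) / 5
        out[i] = math.floor(mean + 0.5)
    return out


def covariates(t, period=52.17):
    """Trend and harmonic terms for week index t (the offset is handled separately)."""
    angle = 2 * math.pi * t / period
    return [t, t ** 2, math.cos(angle), math.sin(angle)]
```

With a centered window, the two weeks at each end of the series cannot be smoothed, which would be consistent with the fitted and projected periods starting and ending slightly inside the full range of available data.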

Using COVID-19 coded death certificates from March 1, 2020, to December 6, 2020, we established the timing of each pandemic wave from trough to trough. Nationally, the first wave occurred from March 1, 2020, to June 27, 2020; the second from June 28, 2020, to October 3, 2020; and the third from October 4, 2020, to December 6, 2020 (the third wave was not complete by the last week of available smoothed data on December 6, 2020). In NY, the pandemic pattern was characterized by an intense first wave in spring 2020, while TX had its major wave in summer 2020 and CA in late 2020. Comparing mortality patterns across these three states provides an opportunity to separate the effect of SARS-CoV-2 infection from that of behavioral changes later in the pandemic. For instance, the effects of healthcare avoidance would predominate in CA or TX in spring 2020, as there was little SARS-CoV-2 activity but much media attention on COVID-19, with cancer patients potentially avoiding medical care out of fear of getting infected. In contrast, risk of infection would dominate in NY in spring 2020, and behavioral factors may have played a role only as SARS-CoV-2 awareness increased and the wave was brought under control by social distancing.

We estimated weekly excess mortality by subtracting the predicted baseline from the observed mortality. We summed weekly estimates to calculate excess mortality for the full pandemic period and for each of the 3 waves within the first year of the pandemic. In addition to estimating the absolute effects of the pandemic on mortality, we also calculated relative effects by dividing excess deaths in each diagnosis group by the model baseline. Confidence intervals on excess mortality estimates were calculated by resampling the estimated model coefficients 10,000 times using a multivariate normal distribution and accounting for negative binomial errors in weekly mortality counts.
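The absolute and relative excess calculations can be sketched in Python as below. The wave boundaries are the national dates reported above; the function names, and the convention that each week is identified by its start date, are assumptions made for this illustration. The resampling-based confidence intervals are not reproduced here.

```python
from datetime import date

# National wave boundaries as reported in the text.
WAVES = {
    "wave 1": (date(2020, 3, 1), date(2020, 6, 27)),
    "wave 2": (date(2020, 6, 28), date(2020, 10, 3)),
    "wave 3": (date(2020, 10, 4), date(2020, 12, 6)),
}


def wave_of(week):
    """Return the wave label for a week (identified by its start date), or None."""
    for label, (start, end) in WAVES.items():
        if start <= week <= end:
            return label
    return None


def excess_by_wave(weeks, observed, baseline):
    """Return {label: (excess deaths, percent over baseline)} per wave and overall.

    Weeks outside the pandemic window are ignored. The relative effect is the
    summed excess divided by the summed model baseline for the same weeks.
    """
    totals = {label: [0.0, 0.0] for label in [*WAVES, "all"]}
    for week, obs, base in zip(weeks, observed, baseline):
        label = wave_of(week)
        if label is None:
            continue
        for key in (label, "all"):
            totals[key][0] += obs - base
            totals[key][1] += base
    return {k: (ex, 100 * ex / b if b else None) for k, (ex, b) in totals.items()}
```

Negative weekly differences are retained in the sums, so a period in which observed deaths fall below the baseline offsets earlier excess.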

We used Pearson correlation to test whether weekly excess mortality from different cancers and chronic conditions was synchronous with underlying COVID-19 deaths. Correlation analysis assumes a direct and immediate effect of COVID-19 on cancer mortality. We also investigated the possibility of delayed effects or harvesting by inspecting the time series for evidence of such effects and by comparing total excess deaths for distinct pandemic waves and the whole of 2020.
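For reference, the same-week Pearson correlation between two weekly series can be computed as in the sketch below. The function name and inputs are illustrative; no lag is applied, in line with the assumption of a direct and immediate effect.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)
```

Here `x` would hold weekly excess deaths for a diagnosis group and `y` the weekly number of COVID-19 coded deaths over the same weeks.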

Projections of excess mortality under the null hypothesis of no specific COVID-19 mortality risk of each condition

To further test the impact of age on the association between chronic conditions and COVID-19, and clarify the additional risk due to each chronic condition, we projected the number of COVID-19 deaths under the null hypothesis that age alone is a risk factor, and that there is no particular interaction between the condition and SARS-CoV-2 infection. Excess mortality projections were then compared with observed excess mortality. We only used MC deaths for this approach to account for the possibility that some individuals may suffer from multiple conditions. For example, an estimated 11.5% of US adults with type 2 diabetes also have a history of cancer (Yeh et al., 2018).

We first calculated the number of expected COVID-19 infections among persons living with a certain chronic condition, by multiplying the estimated number of individuals living with the condition by the reported SARS-CoV-2 seroprevalence among individuals ≥65 years at the end of 2020 (we interpolated between the CDC survey conducted in mid-December 2020 and the next one available, in February 2021) (Centers for Disease Control and Prevention, 2023). We then estimated an age-adjusted COVID-19 infection-fatality ratio (IFR) based on the estimated age distribution of individuals living with the condition and infection-fatality ratios by single year of age (COVID-19 Forecasting Team, 2022) (Appendix 1-Table 7). We multiplied this age-adjusted infection-fatality ratio by the estimated number of infections to arrive at the projected number of COVID-19-related excess deaths for a particular condition during 2020.

To obtain a relative metric of expected COVID-19 burden, we divided projected COVID-19 excess deaths by total deaths in each diagnosis group in the 2019 baseline period (March to December 2019), resulting in an expected percentage elevation over baseline in 2020. We compared this null expectation to the observed percentage elevation over baseline from our excess mortality models. We also generated the expected number of excess deaths under alternative hypotheses where each condition is associated with a 2-fold increased risk of COVID-19 related death given infection (i.e., the baseline age-adjusted infection fatality ratio used in the null hypothesis was increased 2-fold). We also provide projections for a RR of 5 in the Appendix (Appendix 1 - Table 6).
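The projection arithmetic described in this section can be summarized in a short Python sketch. The function and parameter names are hypothetical, and the numbers in the usage note are invented for illustration; they are not estimates from this study.

```python
def projected_excess(n_living, seroprevalence, age_adjusted_ifr,
                     baseline_deaths, rr=1.0):
    """Project COVID-19-related excess deaths for one condition.

    n_living          estimated number of people living with the condition
    seroprevalence    proportion infected by the end of 2020 (0 to 1)
    age_adjusted_ifr  age-adjusted infection-fatality ratio (0 to 1)
    baseline_deaths   deaths in the diagnosis group, March to December 2019
    rr                relative risk of death given infection; 1.0 is the
                      null hypothesis that age alone drives the risk

    Returns (projected deaths, percent elevation over baseline).
    """
    infections = n_living * seroprevalence
    deaths = infections * age_adjusted_ifr * rr
    return deaths, 100 * deaths / baseline_deaths
```

For example, with 1,000,000 people living with a condition, 10% seroprevalence, an age-adjusted IFR of 2% and 50,000 baseline deaths, the null hypothesis yields 2,000 projected deaths (4% over baseline), and `rr=2` doubles both figures.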

Acknowledgements

This paper is dedicated to our colleague Robert J Taylor who succumbed to cancer in 2022 and who wanted to know if a cancer diagnosis was a COVID-19 mortality risk factor.

Additional Information

Funding

LS acknowledges funding from the Carlsberg Foundation, grant number CF20-0046. LS and CLH acknowledge funding from Danish National Research Foundation (DNRF) for PandemiX Center of Excellence. CLH has received contract-based hourly consulting fees from Sanofi outside of the submitted work.

Author contributions

Chelsea Hansen, Data curation, Formal analysis, Visualization, Methodology, Writing – original draft, Writing – review and editing; Cécile Viboud, Data curation, Formal analysis, Visualization, Methodology, Writing – original draft, Writing – review and editing; Lone Simonsen, Conceptualization, Data curation, Formal analysis, Visualization, Methodology, Writing – original draft, Writing – review and editing

Data availability

Individual-level mortality data were obtained from the National Center for Health Statistics. These data are not publicly available due to privacy concerns, but descriptive characteristics have been summarized in Table 1 and Appendix 1 - Table 1. The excess mortality models in this paper use mortality data aggregated by week and US state. These data (with values <10 suppressed), along with the model code, have been posted to the following public GitHub repository: https://github.com/chelsea-hansen/Disentangling-the-relationship-between-cancer-mortality-and-COVID-19

Weekly, state-level data on recorded COVID-19 cases and deaths are publicly available. Data were downloaded from the following link: https://data.cdc.gov/Case-Surveillance/Weekly-United-States-COVID-19-Cases-and-Deaths-by-/pwn4-m3yp and have also been posted as a .csv file to the GitHub repository referenced above.

Disclaimer

This article represents the views of the authors and not necessarily those of the National Institutes of Health or the US government.

Appendix 1

Supplemental Methods

Characteristics of cancer, diabetes, and Alzheimer’s deaths in the pre-pandemic period.

For each chronic condition studied (cancer, diabetes, IHD, Alzheimer’s), we assessed potential changes in the characteristics of deaths during the pandemic period that are unrelated to timing but may signal an association with COVID-19. The first potential confounder is age, which is known to be a major risk factor for COVID-19 mortality. For each chronic condition, we computed the average age-at-death in the pre-pandemic year 2019 and compared this to the average age-at-death in 2020. The second potential confounder is living arrangement, as individuals living in nursing homes may be at increased risk of exposure to (and death from) COVID-19 due to mixing, even though their underlying condition is not per se a risk factor. To test this hypothesis, we also compared the proportion of individuals in each disease group who died in nursing homes in 2019 and 2020.

Finally, to illustrate the impact of coding practices, we compared ICD-10 letter categories between 2020 and 2019 for the underlying cause of death when cancer or diabetes is included on the death certificate but is not listed as the underlying cause of death (Appendix 1 - Figure 9). For 2020, we further compared death certificates listing both COVID-19 and cancer to those listing both COVID-19 and diabetes. For all comparisons between 2019 and 2020, data are limited to March to December to isolate the pandemic period.
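The letter-category comparison can be sketched as follows, assuming each certificate is represented as a pair of the underlying cause code and the list of all codes on the certificate. The function name and the prefix tuple are hypothetical and stand in for the code lists in Table 1.

```python
from collections import Counter


def uc_letter_shares(certificates, condition_prefixes):
    """Share of each ICD-10 letter among underlying causes, restricted to
    certificates that list the condition but not as the underlying cause.

    certificates        iterable of (underlying_code, all_codes) pairs
    condition_prefixes  tuple of ICD-10 prefixes defining the condition
    """
    letters = Counter()
    for underlying, all_codes in certificates:
        listed = any(code.startswith(condition_prefixes) for code in all_codes)
        if listed and not underlying.startswith(condition_prefixes):
            letters[underlying[0]] += 1
    total = sum(letters.values())
    return {letter: n / total for letter, n in letters.items()} if total else {}
```

Running the function separately on the March to December certificates of 2019 and 2020 and comparing the resulting shares would show, for instance, how much of the change is carried by the U category.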

Supplemental tables and figures

Diagnosis groups and corresponding ICD-10 codes, number of underlying and multiple cause deaths, mean age in years at time of death, the percentage of deaths occurring at home, and the percentage of deaths occurring in nursing homes for 2019 and 2020.

Supplemental Table 2. Estimated number of excess deaths and the percentage over baseline for each diagnosis group (National). Estimates are aggregated over all of 2020 and for each COVID-19 wave during 2020.

Supplemental Table 3. Estimated number of excess deaths and the percentage over baseline for each diagnosis group (New York). Estimates are aggregated over all of 2020 and for each COVID-19 wave during 2020.

Supplemental Table 4. Estimated number of excess deaths and the percentage over baseline for each diagnosis group (Texas). Estimates are aggregated over all of 2020 and for each COVID-19 wave during 2020.

Supplemental Table 5. Estimated number of excess deaths and the percentage over baseline for each diagnosis group (California). Estimates are aggregated over all of 2020 and for each COVID-19 wave during 2020.

Projections of COVID-19-related excess mortality patterns for all cancers, ischemic heart disease, and kidney disease in the US, under different hypotheses for the association between the condition and COVID-19.

Estimated age distributions and age-adjusted infection fatality ratios for six types of cancer, diabetes, Alzheimer’s, IHD, and kidney disease.

For each condition we estimated an age-adjusted infection fatality ratio. We first determined the approximate proportion of persons living with each condition across several broad age groups. We aimed to keep age groups roughly consistent between conditions, with the exception of Alzheimer’s disease, for which the entire population at risk is ≥65 years. For all cancers combined and for pancreatic, lung, colorectal, and breast cancers, we used the age distribution of newly diagnosed cases in 2019. We then took a weighted average of the age-specific infection fatality ratios, using the midpoint for each age group. For the oldest age group we used the infection fatality ratio for the average age-at-death in 2019 for that condition.
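The weighted average described here can be written as a minimal Python sketch. The data layout, the function name, and the choice to round a fractional midpoint down to a whole year of age are assumptions made for illustration.

```python
def age_adjusted_ifr(age_groups, ifr_by_age, oldest_group_age):
    """Weighted average of age-specific infection fatality ratios.

    age_groups        list of (low, high, share); high is None for the
                      open-ended oldest group
    ifr_by_age        dict mapping single year of age to IFR
    oldest_group_age  age used for the oldest group (the average
                      age-at-death in 2019 for the condition)
    """
    total_share = sum(share for _, _, share in age_groups)
    weighted = 0.0
    for low, high, share in age_groups:
        # Midpoint of the age group, rounded down (assumption).
        age = oldest_group_age if high is None else (low + high) // 2
        weighted += share * ifr_by_age[age]
    return weighted / total_share
```

Dividing by the total share keeps the result a proper weighted average even when the group proportions do not sum exactly to 1.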

National-level weekly observed and estimated baseline mortality for each diagnosis group as both the underlying cause or anywhere on the death certificate (multiple cause) from 2017 to 2020. Baselines during the pandemic are projected based on the previous years of data.

Correlation between weekly number of COVID-19 coded deaths and excess underlying deaths for each diagnosis group (National).

Correlation between weekly number of COVID-19 coded deaths and excess multiple cause deaths for each diagnosis group (National).

Correlation between weekly number of COVID-19 coded deaths and excess underlying deaths for each diagnosis group (New York).

Correlation between weekly number of COVID-19 coded deaths and excess multiple cause deaths for each diagnosis group (New York).

Weekly observed and estimated baseline mortality for each diagnosis group as both the underlying cause or anywhere on the death certificate (multiple cause) from 2017 to 2020 in New York. Baselines during the pandemic are projected based on the previous years of data.

Weekly observed and estimated baseline mortality for each diagnosis group as both the underlying cause or anywhere on the death certificate (multiple cause) from 2017 to 2020 in Texas. Baselines during the pandemic are projected based on the previous years of data.

Weekly observed and estimated baseline mortality for each diagnosis group as both the underlying cause or anywhere on the death certificate (multiple cause) from 2017 to 2020 in California. Baselines during the pandemic are projected based on the previous years of data.

Comparison of ICD-10 letter categories between 2020 and 2019 for the underlying cause of death when cancer or diabetes is included on the death certificate but is not listed as the underlying cause of death. For both cancer and diabetes, I codes (diseases of the circulatory system) make up the majority of underlying causes. The most notable difference between 2019 and 2020 is the increase in U codes, which include COVID-19 (U071). In total there were 13,434 deaths ascribed to COVID-19 (UC deaths) among cancer MC deaths.

COVID-19 was included in <3% of all cancer deaths and 17% of diabetes deaths. In both cases it was listed as the UC on the majority of death certificates where it was included (81% and 97% for cancer and diabetes, respectively).