Evaluating distributional regression strategies for modelling self-reported sexual age-mixing
Abstract
The age dynamics of sexual partnership formation determine patterns of transmission of sexually transmitted diseases and have long been a focus of researchers studying human immunodeficiency virus. Data on self-reported sexual partner age distributions are available from a variety of sources. We sought to explore statistical models that accurately predict the distribution of sexual partner ages by age and sex. We identified which probability distributions and outcome specifications best captured variation in partner age and quantified the benefits of modelling these data using distributional regression. We found that distributional regression with a sinh-arcsinh distribution replicated observed partner age distributions most accurately across three geographically diverse data sets. This framework can be extended with well-known hierarchical modelling tools and can help improve estimates of sexual age-mixing dynamics.
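As an illustration of the approach named in the abstract, the following is a minimal sketch of a sinh-arcsinh log-density together with covariate-dependent parameters, which is the core idea of distributional regression. It assumes the Jones and Pewsey parameterisation (location `mu`, scale `sigma`, skewness `epsilon`, tail weight `delta`), which may differ from the one used in the paper. The function names, the linear predictors on age and sex, and the `coefs` dictionary are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def sinh_arcsinh_logpdf(x, mu, sigma, epsilon, delta):
    """Log-density of a sinh-arcsinh distribution.

    Assumed parameterisation (Jones and Pewsey): with z = (x - mu) / sigma,
    the transformed value sinh(delta * arcsinh(z) - epsilon) is standard normal.
    epsilon = 0 and delta = 1 recover the normal distribution.
    """
    z = (np.asarray(x, dtype=float) - mu) / sigma
    t = delta * np.arcsinh(z) - epsilon
    s = np.sinh(t)
    c = np.cosh(t)
    return (
        np.log(delta)
        + np.log(c)
        - np.log(sigma)
        - 0.5 * np.log(2.0 * np.pi * (1.0 + z**2))
        - 0.5 * s**2
    )


def distributional_params(age, female, coefs):
    """Map covariates to all four distribution parameters.

    Hypothetical linear predictors: each parameter gets its own
    coefficients for (intercept, age, female). Log links keep
    sigma and delta positive.
    """
    age = np.asarray(age, dtype=float)
    female = np.asarray(female, dtype=float)
    X = np.column_stack([np.ones(age.shape), age, female])
    mu = X @ coefs["mu"]
    sigma = np.exp(X @ coefs["log_sigma"])
    epsilon = X @ coefs["epsilon"]
    delta = np.exp(X @ coefs["log_delta"])
    return mu, sigma, epsilon, delta


def negative_log_likelihood(partner_age, age, female, coefs):
    """Objective a fitting routine would minimise (sketch only)."""
    mu, sigma, epsilon, delta = distributional_params(age, female, coefs)
    return -np.sum(sinh_arcsinh_logpdf(partner_age, mu, sigma, epsilon, delta))
```

In a conventional regression only the location would depend on respondent age and sex. Letting the scale, skewness and tail weight vary with the same covariates is what makes the model distributional.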
Data availability
Data from the Demographic and Health Surveys are available from the DHS Program website (https://dhsprogram.com/data/available-datasets.cfm). Data from the Africa Centre Demographic Information System are available on request from the AHRI website (https://data.ahri.org/index.php/home). Data from the Manicaland study were used with permission from the study investigators (http://www.manicalandhivproject.org/manicaland-data.html).
- AHRI.PIP.Men's General Health.All.Release 2020-07. AHRI Data Repository, doi: 10.23664/AHRI.PIP.RD04-99.MGH.ALL.202007.
- AHRI.PIP.Women's General Health.All.Release 2020-07. AHRI Data Repository, doi: 10.23664/AHRI.PIP.RD03-99.WGH.ALL.202007.
- Haiti Enquête Mortalité, Morbidité et Utilisation des Services 2016-2017 - EMMUS-VI [Dataset]. The DHS Program, Haiti: Standard DHS, 2016-17.
Article and author information
Author details
Funding
Bill and Melinda Gates Foundation (OPP1190661, OPP1164897)
- Kathryn A Risher
- Simon Gregson
- Jeff Eaton
Medical Research Council (MR/R015600/1)
- Simon Gregson
- Jeff Eaton
National Institute of Allergy and Infectious Diseases (R01AI136664)
- Jeff Eaton
Engineering and Physical Sciences Research Council (EP/V002910/1)
- Seth Flaxman
Imperial College London (President's PhD Scholarship)
- Timothy M Wolock
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Ethics
Human subjects: We conducted secondary analysis of previously collected anonymised data in compliance with each data producer's use requirements. Procedures and questionnaires for standard DHS surveys have been reviewed and approved by the ICF International Institutional Review Board (IRB). The Manicaland study was approved by the Medical Research Council of Zimbabwe and the Imperial College Research Ethics Committee. The Africa Centre Demographic Information System PIP surveillance study was approved by the Biomedical Research Ethics Committee, University of KwaZulu-Natal, South Africa (BE290/16).
Copyright
© 2021, Wolock et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- 553 views
- 49 downloads
- 0 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.