A synthetic dataset primer for the biobehavioural sciences to promote reproducibility and hypothesis-generation

Daniel S Quintana (corresponding author)
University of Oslo, Norway

Abstract

Open research data provides considerable scientific, societal, and economic benefits. However, disclosure risks can sometimes limit the sharing of open data, especially in datasets that include sensitive details or information from individuals with rare disorders. This article introduces the concept of synthetic datasets, which is an emerging method originally developed to permit the sharing of confidential census data. Synthetic datasets mimic real datasets by preserving their statistical properties and the relationships between variables. Importantly, this method also reduces disclosure risk to essentially nil as no record in the synthetic dataset represents a real individual. This practical guide with accompanying R script enables biobehavioural researchers to create synthetic datasets and assess their utility via the synthpop R package. By sharing synthetic datasets that mimic original datasets that could not otherwise be made open, researchers can ensure the reproducibility of their results and facilitate data exploration while maintaining participant privacy.
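As a rough illustration of the idea in the abstract, the Python sketch below builds a synthetic dataset that keeps the means, spreads, and correlation of an original two-variable dataset while containing no original record. It is not the synthpop implementation, which is an R package and is not reproduced here. It is a deliberately simplified parametric stand-in that assumes two continuous, roughly normal variables with a linear relationship. The toy "original" data and the helper names (fit_linear, synthesize, pearson) are hypothetical and exist only for this example.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fit_linear(x, y):
    """Least-squares slope, intercept, and residual SD for y regressed on x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    return slope, intercept, statistics.pstdev(residuals)


def synthesize(x, y, n, rng):
    """Draw x from a fitted normal, then y conditional on x from the fitted line.

    Assumption: a normal/linear model is adequate. Real data often are not,
    which is why dedicated tools use more flexible models.
    """
    slope, intercept, resid_sd = fit_linear(x, y)
    mu_x, sd_x = statistics.fmean(x), statistics.pstdev(x)
    syn_x = [rng.gauss(mu_x, sd_x) for _ in range(n)]
    syn_y = [intercept + slope * a + rng.gauss(0, resid_sd) for a in syn_x]
    return syn_x, syn_y


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical "original" dataset: two correlated continuous variables.
original_x = [rng.gauss(0, 1) for _ in range(1000)]
original_y = [0.6 * a + rng.gauss(0, 0.8) for a in original_x]

synthetic_x, synthetic_y = synthesize(original_x, original_y, 1000, rng)

# Utility check: the relationship between variables should be similar.
print("original r: ", round(pearson(original_x, original_y), 3))
print("synthetic r:", round(pearson(synthetic_x, synthetic_y), 3))

# Disclosure check: no synthetic record is a copy of an original record.
overlap = set(zip(original_x, original_y)) & set(zip(synthetic_x, synthetic_y))
print("shared records:", len(overlap))
```

Comparing the two correlations is a crude stand-in for the utility assessment the article describes, and the overlap count illustrates why no synthetic row corresponds to a real individual. A real analysis would compare full distributions and model estimates, not a single summary statistic.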

Data availability

Data and analysis scripts are available at the article's Open Science Framework webpage https://osf.io/z524n/

The following previously published data sets were used:

    Jones BC, DeBruine L (2019) Sociosexuality and self-rated attractiveness. Open Science Framework, DOI: 10.17605/OSF.IO/6BK3W.

Article and author information

Author details

  Daniel S Quintana

    Institute of Clinical Medicine, University of Oslo, Oslo, Norway
    For correspondence: daniel.quintana@medisin.uio.no
    Competing interests: The author declares that no competing interests exist.
    ORCID: 0000-0003-2876-0004

Funding

Novo Nordisk Foundation (Excellence grant NNF16OC0019856)

  • Daniel S Quintana

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

© 2020, Quintana

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Daniel S Quintana (2020) A synthetic dataset primer for the biobehavioural sciences to promote reproducibility and hypothesis-generation. eLife 9:e53275. https://doi.org/10.7554/eLife.53275
