A synthetic dataset primer for the biobehavioural sciences to promote reproducibility and hypothesis-generation

  Daniel S Quintana (corresponding author)
  University of Oslo, Norway

Abstract

Open research data provides considerable scientific, societal, and economic benefits. However, disclosure risks can sometimes limit the sharing of open data, especially in datasets that include sensitive details or information from individuals with rare disorders. This article introduces the concept of synthetic datasets, which is an emerging method originally developed to permit the sharing of confidential census data. Synthetic datasets mimic real datasets by preserving their statistical properties and the relationships between variables. Importantly, this method also reduces disclosure risk to essentially nil as no record in the synthetic dataset represents a real individual. This practical guide with accompanying R script enables biobehavioural researchers to create synthetic datasets and assess their utility via the synthpop R package. By sharing synthetic datasets that mimic original datasets that could not otherwise be made open, researchers can ensure the reproducibility of their results and facilitate data exploration while maintaining participant privacy.
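
To make the central idea concrete, here is a minimal sketch in Python (standard library only). It is not the synthpop workflow this article describes, and it is far simpler than what that package offers. It assumes two continuous variables that are roughly bivariate normal. The simulated "observed" data, the variable names, and the `pearson` and `synthesise` helpers are all hypothetical, introduced purely for illustration. The sketch fits the means, standard deviations and correlation of the observed data, then draws entirely new records from the fitted model, so that no synthetic row corresponds to a real individual while the relationship between the variables is approximately preserved.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def synthesise(xs, ys, n, rng):
    """Draw n synthetic (x, y) records from a bivariate normal model
    fitted to the observed data (an illustrative assumption, not the
    method used by synthpop)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    r = pearson(xs, ys)
    syn_x, syn_y = [], []
    for _ in range(n):
        # Two independent standard normal draws, mixed so that the
        # resulting pair has correlation r.
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        syn_x.append(mx + sx * z1)
        syn_y.append(my + sy * (r * z1 + math.sqrt(1 - r * r) * z2))
    return syn_x, syn_y


rng = random.Random(2020)

# Hypothetical "observed" dataset standing in for confidential records.
real_x = [rng.gauss(50, 10) for _ in range(2000)]
real_y = [0.6 * x + rng.gauss(0, 8) for x in real_x]

syn_x, syn_y = synthesise(real_x, real_y, 2000, rng)

# A simple utility check: compare summary statistics of both datasets.
print(f"observed  r = {pearson(real_x, real_y):.3f}")
print(f"synthetic r = {pearson(syn_x, syn_y):.3f}")
```

Comparing the two correlations is a crude version of a utility assessment. A real analysis would compare the full distributions and re-run the substantive models on both datasets.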

Data availability

Data and analysis scripts are available on the article's Open Science Framework page: https://osf.io/z524n/

The following previously published dataset was used:

    Jones BC, DeBruine L (2019) Sociosexuality and self-rated attractiveness. Open Science Framework. DOI: 10.17605/OSF.IO/6BK3W

Article and author information

Author details

  1. Daniel S Quintana

    Institute of Clinical Medicine, University of Oslo, Oslo, Norway
    For correspondence: daniel.quintana@medisin.uio.no
    Competing interests: The authors declare that no competing interests exist.
    ORCID: 0000-0003-2876-0004

Funding

Novo Nordisk Foundation (Excellence grant NNF16OC0019856)

  • Daniel S Quintana

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

© 2020, Quintana

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Daniel S Quintana (2020) A synthetic dataset primer for the biobehavioural sciences to promote reproducibility and hypothesis-generation. eLife 9:e53275. https://doi.org/10.7554/eLife.53275
