eLife supports development of open technology stack for publishing reproducible manuscripts online

The Reproducible Document Stack will ultimately allow authors to submit their manuscripts in a format that includes embedded code blocks and computed outputs, and publishers to preserve these assets in an enhanced version of the published online article.

eLife, in collaboration with Substance and Stencila, is supporting a project to create an open stack of tools for authoring, compiling and publishing computationally reproducible manuscripts online.

eLife aims to help scientists accelerate discovery by operating a platform for research communication that encourages and recognises the most responsible behaviours in science. In July, the open-access publisher joined a consortium of organisations committed to supporting Substance, a JavaScript library of tools for web-based content editing. Now, as part of its mission, eLife is working with Substance to create the Reproducible Document Stack: a set of tools and frameworks through which publishers can present peer-reviewed, computationally reproducible documents online and in full.

By the end of the project, eLife aims to have developed and published a working prototype of a reproducible document, demonstrating a complete end-to-end technology stack from authoring through to publication. The Reproducible Document Stack will ultimately allow authors to submit their manuscripts in a format that includes embedded code blocks and computed outputs (statistical results, tables or graphs), and publishers to preserve these assets in an enhanced version of the published online article.

“The reproducibility of published biomedical research is currently an issue of widespread concern,” says Giuliano Maciocci, eLife’s Head of Product. “The idea of a reproducible document, or reproducible manuscript, is one where any source code and data used to generate research artifacts, such as statistical results, graphs, equations or tables, would be preserved right through the authoring, submission, peer-review and publication pipeline and then presented in a format that allows others to easily access it, ultimately leading to improved replication and reuse. In practice, this could mean clicking on a statistical plot to immediately see the code that generated it, and being able to tweak that code and generate the plot again to see the results – without ever leaving the browser.”

Currently, researchers are able to document their computational experiments through file formats such as R Markdown and platforms such as Jupyter. These files serve as lab records and can be shared independently from or alongside the resulting research article. However, there is no means for a researcher to present their research in this form through the traditional journal. Instead, current users of these technologies submit a “flattened” version of the documents to a journal, losing the value of embedded code and data references.

Additionally, while code and data can be shared as supplementary files to the manuscript or described within an author’s own publications, downloading the files to reproduce and modify the analysis offline relies on a researcher being comfortable with a computational analysis program. The researcher also needs the resources to replicate the author’s original development environment and its associated dependencies.

Offering a means to share reproducible documents through a standardised Reproducible Document Stack would help counter these challenges and incentivise the sharing of demonstrably reusable data and code underlying a research article. However, Maciocci says providing accessible code authoring and publication is only half the battle.

“To coexist harmoniously with traditional publishing infrastructures and remain accessible across different devices, such computationally enriched research artifacts still need to accommodate more basic forms of consumption,” he explains. “The Reproducible Document Stack will therefore incorporate the concept of progressive enhancement – a user experience design strategy that emphasises essential content first, but then progressively builds more technically rigorous layers of presentation and interaction onto the content as the end user’s browser and/or device allow.”

Michael Aufreiter, one of the developers behind Substance, adds: “To allow us to succeed in creating an effective reproducible document, it will also need to be supported right through the authoring, sharing and publication stages, and will require the development of new technology at all three of these touch points. Working closely with the eLife team, we hope to meet each of these milestones and pave the way for researchers to be able to publish entire reproducible manuscripts as part of an effort to improve the replicability of crucial biomedical findings.”

To learn more about the background to this project, please visit: https://elifesciences.org/labs/7dbeb390/reproducible-document-stack-supporting-the-next-generation-research-article

To find out more about progressive enhancement, please see: https://elifesciences.org/labs/e5737fd5/designing-progressive-enhancement-into-the-academic-manuscript

To find out more about eLife’s partnership with the Substance Consortium, see: https://elifesciences.org/for-the-press/f87f62a7/elife-joins-substance-consortium-to-support-development-of-open-source-online-content-editing-tools

Media contacts

  1. Emily Packer
    eLife
    e.packer@elifesciences.org
    +441223855373

About

eLife aims to make the communication of results more beneficial for the scientific community as a whole, by operating a platform for presenting research that encourages and recognises the most responsible behaviours in science. While eLife has made its name largely through its consultative approach to peer review and the papers it has published, the organisation seeks to improve all aspects of research communication in support of excellent science – from technology and infrastructure to the ways individuals receive recognition. eLife is supported by the Howard Hughes Medical Institute, the Max Planck Society and the Wellcome Trust. Learn more at elifesciences.org.