By Hannah Sonntag (ORCID) and Thomas Lemberger (ORCID), EMBO
Life science today is a diverse and interdisciplinary field. Labs working in different fields need to be able to collaborate and exchange their results in spite of the diversity of data types and experimental approaches. Many scientists engaged in interdisciplinary collaborations run into problems when sharing their results: How can I show my results to colleagues from another field without being overly technical? How can I provide my collaborators with comprehensive access to my raw dataset? Where did I keep the data I produced two years ago?
In this regard, figures have the potential to greatly simplify the communication of scientific data across labs and disciplines. Presenting results in scientific figures is one of researchers' everyday tasks, and perhaps one of their most valuable outputs. Figures are the de facto standard for the exchange and evaluation of scientific data in the lab, in papers and at conferences. For these reasons, we propose to use figures as a natural starting point for organizing the underlying source data files, computer scripts, and protocols in a standardized way.
The SourceData (sourcedata.io) platform was developed by EMBO using a standardized system to represent the key elements of an experiment in a structured, machine-readable way. SourceData is coupled to a public data repository (the EBI database BioStudies), which allows linking results to their underlying experimental source data (Liechti et al, 2017). This platform is currently used for the routine curation of figures published in EMBO Press journals. Figures processed by SourceData are freely available online as interconnected "SmartFigures", linking related content together. This allows users to easily retrieve relevant scientific results from different papers and journals. Annotated elements are linked to biological knowledge databases and, whenever available, the source data files, which are directly downloadable from the SmartFigure. Moreover, users can quickly retrieve SmartFigures using SourceData’s semantic search engine or Google Dataset Search.
We are currently developing new community-oriented open-source tools based on these principles to enable researchers to easily create their own SmartFigures and share them with their colleagues.
A dashboard to share SmartFigures
The first tool is the SmartFigure Dashboard, or SDash, with which researchers can easily create SmartFigures by dragging and dropping their result figures onto their personal dashboard and linking them with local or remote data files, computer scripts, and protocols.
The platform's artificial intelligence engine automatically extracts an initial structured representation from the figure caption that serves as a starting point for organizing, interconnecting and browsing SmartFigures.
SDash allows users to create 'sharing groups' so that SmartFigures can be shared within a circle of interdisciplinary collaborators. Users in this group can download SmartFigures as self-contained data packages, examine and (re-)analyse the presented results and the underlying raw data. Within sharing groups, online commenting allows collaborators to exchange ideas, promote critical debates about the shared results or simply clarify questions. While opening the interdisciplinary dialogue, users have full control over their data with maximum flexibility on how to share their content.
SmartFigures as portable packages
The second tool is the desktop SmartFigure Editor, which allows users to edit SmartFigure packages downloaded from SDash or to create them from scratch.
The SmartFigure Editor was developed in collaboration with Michael Aufreiter and Oliver Buchtala using their open-source Substance JavaScript library, which is also the code base for Texture, an online full-text editor developed by Substance.
With SmartFigures, we are in the process of implementing open standards such as schema.org for the indexing of online SmartFigures, JATS for compatibility with publishing platforms, and IIIF or similar technologies to enable distributed data and image storage. In particular, the SmartFigure Editor is designed for straightforward integration with preprint and publishing platforms, thus opening the door to publicly sharing scientific results linked to primary data.
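To make the schema.org indexing idea concrete, here is a minimal sketch of how a single figure and its source data files might be described as a schema.org Dataset in JSON-LD, the kind of markup that dataset search engines can index. This is an illustration only, not the SmartFigures implementation: the function name smartfigure_jsonld, the choice of properties, and all titles and example.org URLs are assumptions made for this example.

```python
import json


def smartfigure_jsonld(title, caption, figure_url, source_files):
    """Build a schema.org Dataset record for one figure.

    Hypothetical helper for illustration; the actual SmartFigures
    metadata schema is not described in this post.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": title,
        "description": caption,
        "image": figure_url,
        # Each source data file becomes a downloadable distribution.
        "distribution": [
            {
                "@type": "DataDownload",
                "name": f["name"],
                "contentUrl": f["url"],
                "encodingFormat": f["format"],
            }
            for f in source_files
        ],
    }


# Placeholder values only; none of these refer to real resources.
record = smartfigure_jsonld(
    title="Figure 1 (placeholder title)",
    caption="Placeholder caption describing the experiment.",
    figure_url="https://example.org/figures/fig1.png",
    source_files=[
        {
            "name": "fig1_panelA.csv",
            "url": "https://example.org/data/fig1_panelA.csv",
            "format": "text/csv",
        },
    ],
)

print(json.dumps(record, indent=2))
```

Embedding such a record in the web page of an online figure is one common way to expose it to indexing services; how SmartFigures will do this in practice may differ.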
Local openness to accelerate global open science
When looking at some of the scientific discoveries that led to a Nobel Prize, it is worth noting that some major breakthroughs were reported quite differently from how science is communicated today. Watson and Crick’s paper was little more than one page long (Watson and Crick, 1953); O’Keefe first reported on place cells in the form of preliminary results shown in a single figure (O’Keefe and Dostrovsky, 1971). Such early dissemination of specific findings is hampered today by an ultra-competitive environment and by a context that often encourages authors to overload their papers with a deluge of supplementary figures.
With the SmartFigures tools, we want to open new avenues for researchers to use their own network of trusted collaborators to quickly communicate their findings at an early stage and receive expert feedback prior to formal publication and public dissemination. The SmartFigures suite bridges the open sharing of research findings amongst collaborators (local openness) with their subsequent open public dissemination (global openness), allowing a frictionless, yet controlled sharing of results and data. Thus, it will ultimately serve as a catalyst to accelerate open science communications on a global scale.
At the moment, the SDash platform is still in the pilot phase. If you or your lab, institute or research consortium is interested in using the SDash platform or the SmartFigure Editor, please contact us (hannah [dot] sonntag [at] embo [dot] org, or thomas [dot] lemberger [at] embo [dot] org). We would be happy to gather early feedback and continue to shape the SDash platform to the community's needs.
EMBO is developing SDash in close collaboration with Julien Colomb and Matthew Larkum from the SFB1315 Z project at the Humboldt University of Berlin (SFB1315 Consortium "Mechanisms and disturbances in memory consolidation") and Robin Liechti and Orlin Topalov from the SIB Swiss Institute of Bioinformatics.
Bibliography
Liechti R, George N, Götz L, El-Gebali S, Chasapi A, Crespo I, Xenarios I, Lemberger T (2017). SourceData: A Semantic Platform for Curating and Searching Figures. Nat Methods 14:1021-1022.
O’Keefe J, Dostrovsky J (1971). The hippocampus as a spatial map. Preliminary evidence from unit activity in the freely-moving rat. Brain Research 34:171-175.
Watson JD, Crick FH (1953). Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid. Nature 171:737-738.