By Aziz Khan and Tomasz Konopka
Placing research in the context of existing knowledge is a critical step in the scientific process; we routinely look for information about genes, chemicals and drugs, genetic variants, and other entities. Vast amounts of relevant data are available in databases, and their breadth and depth are growing. For example, the ClinVar archive, which collects associations between genetic variants and diseases, increased in size by 60% in the past year. At the same time, smaller datasets are published on a regular basis. The knowledge landscape in the biomedical domain is changing rapidly, and that is exciting, but this rapid change and growth also create challenges in how to use this wealth of information effectively in day-to-day research.
How do we currently use open biomedical data in everyday situations? Querying a search engine, of course, is a familiar route that can reveal relevant primary sources. Alternatively, it is possible to navigate to a specific data portal or data-integration platform and search on the domain-specific site. These approaches are effective, but it can nonetheless be time-consuming to browse multiple entities, find niche details, and verify that a data aggregator has up-to-date information.
Such concerns are continually discussed by data creators as well as data curators. The FAIR initiative, in particular, published guiding principles for maximising the value of open data. These provide concrete definitions for what it means for data to be findable, accessible, interoperable and reusable (FAIR). As a result of these principles, as well as independent developments in web technology and the efforts of biocuration teams, many biomedical data are today accessible from primary sources through direct channels.
The ability to download data from primary sources, whenever needed, opens exciting possibilities for data discovery and analysis. Aiming to streamline access to open data, we developed a browser extension called FAIR-biomed for previewing snippets of information from any web page. It works inside a browser to access specialist databases without the need to switch tabs, open new windows, or type URLs.
The extension takes some text as input, provides a list of relevant biomedical resources, and composes queries to a chosen database. It then fetches results from that resource and displays a subset of information. Within a few keystrokes, it is possible to see a topic from several perspectives.
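As a rough illustration of this workflow, the sketch below goes from input text to a list of relevant resources, then to a composed query, then to a short preview. It is written in Python for brevity and is not the extension's actual code: the resource names, matching patterns, `example.org` URLs, and the functions `relevant_resources`, `compose_query` and `preview` are all hypothetical, and the network call is replaced by a stand-in function.

```python
import re
from urllib.parse import quote

# Hypothetical resource descriptions: names, patterns and URLs are placeholders,
# not the databases or endpoints that FAIR-biomed actually uses.
RESOURCES = [
    {
        "name": "variant-db",
        "pattern": re.compile(r"^rs\d+$"),            # looks like a variant id
        "url": "https://example.org/variants?id={query}",
    },
    {
        "name": "gene-db",
        "pattern": re.compile(r"^[A-Z][A-Z0-9]{1,9}$"),  # looks like a gene symbol
        "url": "https://example.org/genes?symbol={query}",
    },
    {
        "name": "literature-db",
        "pattern": re.compile(r".+"),                 # accepts any non-empty text
        "url": "https://example.org/search?q={query}",
    },
]


def relevant_resources(text):
    """Return the names of resources whose pattern accepts the input text."""
    query = text.strip()
    return [r["name"] for r in RESOURCES if r["pattern"].match(query)]


def compose_query(name, text):
    """Build the query URL for the chosen resource, escaping the input text."""
    resource = next(r for r in RESOURCES if r["name"] == name)
    return resource["url"].format(query=quote(text.strip()))


def preview(name, text, fetch, max_items=3):
    """Fetch results from one resource and keep only the first few records."""
    records = fetch(compose_query(name, text))
    return records[:max_items]


def fake_fetch(url):
    """Stand-in for a network request, so the sketch runs offline."""
    return [f"record {i} from {url}" for i in range(1, 6)]


print(relevant_resources("BRCA1"))
print(preview("gene-db", "BRCA1", fake_fetch, max_items=2))
```

Passing the fetch function in as an argument keeps the query logic testable without a network connection; a real implementation would issue an HTTP request at that point.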
The idea of previewing data within web pages has a long history. Reflect was an early example of a browser extension with this aim, and GIX is a more modern project. These tools focus on providing information about genes from preselected data sources. FAIR-biomed is similar in some regards, but also offers unique features and possibilities. It defers data curation to the specialised resources and only retrieves relevant portions on demand. It is also not limited to searching for gene names and can thus inform on publications, mouse models, genomic regions, variants, and other entities.
FAIR-biomed is open-source (GitHub repository) and has a modular structure. Its library comprises independent components that are each responsible for interfacing with one data resource.
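One way to picture such a modular library is a registry of small, independent plugin objects, each of which knows about exactly one data resource. The sketch below is a minimal Python illustration under that assumption; the `Plugin` fields (`id`, `claim`, `url`), the `Library` class and the `example.org` URLs are hypothetical and are not taken from the FAIR-biomed code base.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Plugin:
    """One self-contained component for one data resource (hypothetical shape)."""
    id: str
    claim: Callable[[str], float]  # relevance of this resource for a text, 0 to 1
    url: Callable[[str], str]      # composes the query URL for a text


class Library:
    """Registry of plugins; adding one plugin does not modify any other."""

    def __init__(self):
        self._plugins: Dict[str, Plugin] = {}

    def register(self, plugin):
        if plugin.id in self._plugins:
            raise ValueError(f"plugin already registered: {plugin.id}")
        self._plugins[plugin.id] = plugin

    def candidates(self, text):
        """Ids of plugins that claim the text, most relevant first."""
        scored = [(p.claim(text), p.id) for p in self._plugins.values()]
        ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
        return [plugin_id for score, plugin_id in ranked if score > 0]


# Two illustrative plugins with placeholder URLs.
genes = Plugin(
    id="genes",
    claim=lambda text: 1.0 if text.isupper() and " " not in text else 0.0,
    url=lambda text: f"https://example.org/genes?symbol={text}",
)
publications = Plugin(
    id="publications",
    claim=lambda text: 0.5 if text.strip() else 0.0,
    url=lambda text: f"https://example.org/search?q={text}",
)

library = Library()
library.register(genes)
library.register(publications)

print(library.candidates("TP53"))
print(library.candidates("lung cancer"))
```

In a design like this, supporting a new database means writing one more `Plugin` and registering it, without touching the code for existing resources.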
We are also steadily adding more plugins to FAIR-biomed’s library. We invite you to try the extension out, and please do let us know if there are specific plugins that you would like to see in FAIR-biomed. We welcome all feedback from the research community.
FAIR-biomed is also compatible with other browsers; see the installation documentation.
We welcome comments, questions and feedback. Please annotate publicly on the article or contact us at innovation [at] elifesciences [dot] org.
Do you have an idea or innovation to share? Send a short outline for a Labs blogpost to innovation [at] elifesciences [dot] org.