Blog post by Richard Smith-Unna, Fathom Labs (Nairobi), COKO Foundation and Code for Science
We just released ScienceFair, a new open-source desktop science library that you can download from http://sciencefair-app.com. It is full of exciting new technology, and is driven by a vision of how the future of scholarly literature could be better.
In this post I’ll explain the motivation and story behind ScienceFair. I’ll dive into what makes it different, how it works and how you can get involved.
The way we access, read and reuse scientific literature is largely controlled by a few vast publishing organisations. In some cases, these organisations are controlling the literature deliberately as part of their business model. In other cases, such as with PubMed, the platform aims to distribute literature freely. In all cases, these platforms provide a single, inflexible way for people to access and use the literature.
At the same time, organisations like eLife are doing things differently, challenging the status quo and driving innovation. There are countless other examples of projects that have good ideas and technology that make a positive difference in the way science is communicated and consumed. However, such innovations rarely make the transition from these projects into the massive-scale literature platforms that most people use. People only experience these innovations when they use specific websites, install browser extensions, or temporarily move away from their usual workflows to experience new things. This results in change happening very slowly even though innovation is happening fast.
The team behind ScienceFair have a vision of a different, better future for science. A future that's more fair, inclusive and open. A future that makes it easy to innovate and explore, and where users control and customise their experience. The ScienceFair app is one road by which we think this could be achieved, and this v1.0 release marks the first step down that road.
All the people working on ScienceFair were thinking about the problems discussed above before we ever knew about each other. A series of fortunate meetings and introductions brought us all together. In particular, Fathom Labs, the Dat team, the Mozilla Science Lab, and the Freeman lab - now all loosely organised under the umbrella of Code for Science - met at CSVconf Berlin in spring 2016. There we also met the Substance team, who made the connection between our vision and that of eLife, and introduced us. From then on, Fathom Labs and eLife collaborated on designing the user experience of the app and funding the development work that got us to v1.0.
ScienceFair’s first release introduces several new concepts and technologies that have never been used before in desktop science library apps. These include:
- Instant multi-source search as the main user interface
- A science-focused article reading interface using eLife Lens
- Live-updating cryptographically secured peer-to-peer distributed datasources
- Inclusion of data mining in the basic interface
Instant multi-source search
When you open ScienceFair, you see a very simple interface: a search bar and an invitation to use it.
There are no buttons or menus - searching is how you find things. When you start typing a search query, ScienceFair instantly starts searching in multiple databases. It looks in your local collection - these are the papers you have downloaded. It also searches any datasources you have subscribed to (by default, eLife is the only datasource). If there are any results, they will start displaying instantly, with papers from your local collection and your datasources all combined into a single result stream. The most relevant papers are shown first, and you can start exploring the results whilst more are arriving.
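To make the idea of a single combined result stream concrete, here is a minimal Python sketch. It is not ScienceFair's actual code: the `Hit` record, the toy term-count scoring and the function names are all assumptions made for illustration. Each source returns its own hits best-first, and the streams are merged lazily so the most relevant papers come out first regardless of where they came from.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Hit:
    title: str
    source: str   # hypothetical label, e.g. "local" or "elife"
    score: float  # higher means more relevant


def search_source(papers: List[str], source: str, query: str) -> List[Hit]:
    """Score each title by how often the query terms occur (toy relevance)."""
    terms = query.lower().split()
    hits = []
    for title in papers:
        words = title.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:
            hits.append(Hit(title, source, float(score)))
    # Each source hands back its own hits best-first.
    return sorted(hits, key=lambda h: -h.score)


def merge_results(*streams: Iterable[Hit]) -> Iterator[Hit]:
    """Lazily merge several best-first streams into one best-first stream."""
    return heapq.merge(*streams, key=lambda h: -h.score)


local = ["Gene expression in yeast", "Protein folding dynamics"]
elife = ["Yeast gene regulation and gene networks", "Neural coding in mice"]

results = list(merge_results(
    search_source(local, "local", "gene yeast"),
    search_source(elife, "elife", "gene yeast"),
))
for hit in results:
    print(f"{hit.score:.0f}  [{hit.source}] {hit.title}")
```

Because the merge is lazy, a caller could start showing the first results while slower sources are still producing theirs, which is the behaviour described above.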
The idea of this interface is to have the smallest barrier possible between your thought process and your use of the app. If you’re thinking it, type it in the search and see what starts appearing. This was one of the key design insights that arose from our collaboration with eLife. We knew we wanted to have a simple, clean interface, but eLife’s Head of Product, Giuliano Maciocci, had the expertise and experience to crystallise these ideas into guidelines for the design. With his help, we came to focus on two things: search as the main interface, and incremental discovery as the mechanism for users to learn how to use the app. This is a lesson we’ve carried forward into all our subsequent software work. Incremental discovery involves making it obvious which next step a user should take, while never exposing too many choices at once. The user comes to learn all the possibilities of the software in a way that feels natural and not overwhelming.
The search system in ScienceFair is built on Google’s LevelDB as the underlying data store, with `search-index` by Fergie McDowall providing the clever indexing, and our own `yunodb` taking care of text analysis and making the whole thing fit nicely into a cross-platform app.
One of the most important innovations we included in v1.0 is the Lens reader, developed by eLife and Substance. We believe Lens is a shining example of how much things can improve when you re-imagine a system from the ground up, and we knew from the start that we wanted it in ScienceFair. Although the Lens reader is open source, it has so far only been used by eLife and a few other journals to present research online. ScienceFair brings the Lens reader to your desktop, and we’ve adapted it to work with millions of open-access papers that will become available through ScienceFair this week.
Lens is the only open-source article reader we know of that really takes advantage of the computer as a reading device, rather than treating it as a piece of paper. By displaying an article in two panes - main text on the left, contextual information (such as figures and references) on the right - Lens allows you to keep your focus on the content of the paper whilst being able to easily access detail.
We are particularly grateful to the team from Substance who originally developed Lens, as they introduced us to the eLife team. The Substance folks have gone on to take web-based content creation and presentation to new levels, and we are excited to integrate more of their developments into future versions of ScienceFair.
Crucial to ScienceFair is the concept of subscribing to datasources. A datasource is a live-updating stream of articles. You can think of it as being like subscribing to a mailing list, except that everything you subscribe to is automatically added to a search engine on your computer. This is extremely powerful because, without any additional effort on your part, the literature you search stays up to date.
Anyone can create a datasource and share it with others, and the app comes with eLife as the default datasource. When new eLife papers are released, all ScienceFair users subscribed to eLife get new papers added to their search index automatically. In the coming weeks we will add all openly licensed papers from PubMed Central as another default source, and we will release tools for people to create their own datasources.
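The subscription model can be pictured with a small Python sketch. This is an illustration under stated assumptions, not how ScienceFair or Dat implement it: the `Datasource` and `SearchIndex` classes and their methods are hypothetical, and everything lives in memory rather than travelling over a network.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set


class Datasource:
    """A hypothetical live stream of articles that notifies subscribers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.articles: List[str] = []
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        # A new subscriber first catches up on everything already published.
        for title in self.articles:
            callback(title)
        self._subscribers.append(callback)

    def publish(self, title: str) -> None:
        self.articles.append(title)
        for callback in self._subscribers:
            callback(title)


class SearchIndex:
    """A toy inverted index mapping each word to the titles containing it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, title: str) -> None:
        for word in title.lower().split():
            self._postings[word].add(title)

    def search(self, word: str) -> Set[str]:
        return set(self._postings.get(word.lower(), set()))


elife = Datasource("elife")
elife.publish("Neural coding in mice")

index = SearchIndex()
elife.subscribe(index.add)                 # catches up on the existing paper
elife.publish("Sleep and memory in mice")  # arrives with no further action

print(sorted(index.search("mice")))
```

The point of the sketch is the last two lines of setup: once the index is subscribed, newly published papers become searchable without the user doing anything.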
We have big plans for datasources in the future. For example, the same underlying technology will allow a user to create a new datasource from one or more keyword tags used to organise their local collection. When they share the datasource key with a group, the others with the key will add it to ScienceFair once, and then have a live updating feed of the shared papers. Whenever the original user adds new papers to the tags, all the others in their group will get the same papers updated on their machines. We’ve imagined how this might support journal clubs, study groups, and colleagues collaborating on research. We have users who want to create a live feed of new papers, posters and preprints during a conference, so attendees can subscribe to relevant information during events. We’re looking forward to seeing what new uses people find that we haven’t imagined.
Although datasources “just work” in ScienceFair, under the surface they are the most technically challenging and exciting innovation we’ve included. Every datasource is a Dat feed. In technical terms it’s an append-only log, signed by the creator using public-key cryptography, and distributed peer-to-peer by a swarm of users. It’s secure, private (unless you decide to share it), and amazingly, it all works faster than normal downloading. In our tests we were able to transfer data between servers in different cities around the world roughly twice as fast using Dat as with any other method we tried. This is incredible - we re-ran the tests many times as we didn’t believe them at first. We are extremely grateful to the Dat team, in particular Mathias Buus (@mafintosh), Max Ogden (@denormalize), Karissa McKelvey (@okdistribute), and Yoshua Wuyts (@yosh), who understood the value of ScienceFair from the beginning, and have donated tremendous energy and time to make it happen.
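For readers unfamiliar with append-only logs, the following Python sketch shows the general idea of making history tamper-evident. It is a deliberately simplified stand-in: it uses a plain SHA-256 hash chain, leaves out the public-key signing and peer-to-peer distribution entirely, and does not reproduce Dat's actual data format. The class and function names are assumptions for illustration only.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: str) -> str:
    """Hash an entry together with the hash of the entry before it."""
    data = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


class AppendOnlyLog:
    def __init__(self) -> None:
        self.entries: List[Dict[str, str]] = []

    def append(self, payload: str) -> str:
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        digest = entry_hash(prev, payload)
        self.entries.append({"prev": prev, "payload": payload, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute every hash; editing an earlier entry breaks the chain."""
        prev = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev:
                return False
            if entry["hash"] != entry_hash(prev, entry["payload"]):
                return False
            prev = entry["hash"]
        return True


log = AppendOnlyLog()
log.append("paper-001")
log.append("paper-002")
print(log.verify())                     # the untouched chain verifies

log.entries[0]["payload"] = "tampered"  # try to rewrite history
print(log.verify())                     # the chain no longer verifies
```

In a signed log, the creator would additionally sign the latest hash with their private key, so that subscribers can check both that the history is intact and that it came from the expected author.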
Several of the members of the ScienceFair team have worked on text and data mining in the past. One of the major lessons we all learned is that the gap between how the tools work and what users need is huge. Usually the tools are command-line programs with no graphical interface - it takes a lot of technical learning to get started. In ScienceFair we came up with a way to bring data mining tools into a more familiar environment: you can turn any search, selection, or tag into a bibliometrics data dashboard.
In this first release, we’ve kept it simple: you can see the keywords shared by a collection of papers, which authors have the most papers, and how many papers were published each year. But we have plans to make this a key part of ScienceFair in the future, because we believe that most researchers are not currently benefitting from the many breakthroughs in text and data mining.
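The three summaries in this first dashboard are simple counting operations. A minimal Python sketch, assuming made-up paper records with hypothetical `year`, `authors` and `keywords` fields, and reading "shared" keywords as those that appear on every paper in the collection:

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical records; field names and values are invented for this sketch.
papers: List[dict] = [
    {"year": 2015, "authors": ["A. Rivera", "B. Chen"],
     "keywords": ["yeast", "genomics"]},
    {"year": 2016, "authors": ["B. Chen"],
     "keywords": ["yeast", "imaging"]},
    {"year": 2016, "authors": ["B. Chen", "C. Okafor"],
     "keywords": ["yeast", "genomics"]},
]


def shared_keywords(collection: List[dict]) -> List[str]:
    """Keywords that appear on every paper in the collection."""
    keyword_sets = [set(p["keywords"]) for p in collection]
    return sorted(set.intersection(*keyword_sets)) if keyword_sets else []


def top_authors(collection: List[dict], n: int) -> List[Tuple[str, int]]:
    """The n authors with the most papers, with their paper counts."""
    counts = Counter(a for p in collection for a in p["authors"])
    return counts.most_common(n)


def papers_per_year(collection: List[dict]) -> Dict[int, int]:
    """Number of papers published in each year, in chronological order."""
    counts = Counter(p["year"] for p in collection)
    return dict(sorted(counts.items()))


print(shared_keywords(papers))   # ['yeast']
print(top_authors(papers, 1))    # [('B. Chen', 3)]
print(papers_per_year(papers))   # {2015: 1, 2016: 2}
```

Any search result, selection or tag is just a collection of papers, so the same three functions could be applied to each of them.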
We have only just got started on ScienceFair. We have a roadmap that will bring us gradually closer to the future ecosystem we want to see:
- v1.1 will focus on datasources: we will be steadily releasing new datasources in partnership with publishers, aggregators and other groups. We’ll be creating new tools for people to create and maintain datasources, and adding features to ScienceFair to allow users to make and share datasources from their collections. We’ll also be making a new web community where people can share and discover interesting datasources.
- v1.2 will focus on enrichment: we’ll be adding features that show altmetrics and updates (e.g. retractions) in the context of articles in the app, updated in real time. We’ll incorporate that data into the bibliometrics tools, and improve those tools to support more advanced analyses. We’ll also be connecting papers to sources of post-publication peer review and adding interfaces for commenting and discussion of articles.
- v2.0 will focus on customisation: we’ll be adding a package manager, like a free app store, to ScienceFair. This will allow users to customise and specialise their experience to suit their needs, and we will be helping a growing community of developers to bring their ideas to life inside ScienceFair. As with datasources, we’ll build a web community where people can share and discover new packages for ScienceFair.
There are many options available for researchers looking to manage their references - Zotero is our favourite, but alternatives like PaperPile, Papers and Mendeley are popular. With so many existing apps, why would anyone want to use ScienceFair?
We think ScienceFair is something completely different. It’s a desktop library and research tool, not a reference manager (although you will be able to use it for reference management in an upcoming release). It’s also a way to re-imagine how we manage and use the scholarly literature. We hope that we’ve succeeded in communicating our vision, and that people will want to take this journey with us.
ScienceFair is an app you can download right now: go to http://sciencefair-app.com to do that. We currently provide installers for Windows, Mac, and various kinds of Linux.
We are also an open community. If you care about science, the scholarly literature and the future of those things, we invite you to join us. We have an open chat room on IRC where someone is always available to chat, and an open issue tracker where you can make suggestions and report bugs. You can follow us on Twitter at @ScienceFairApp.
The source code for ScienceFair is licensed under the MIT License and available at https://github.com/codeforscience/sciencefair.