What software licence is best suited for an Open Access ecosystem?

By Ian Mulvany

You are building open source software for publishing open access research. What software licence do you pick? This is exactly the question that we are thinking about right now at eLife. We announced at the beginning of the summer that we are going to open source our publishing platform, Continuum, but before we do that we have to pick a licence for it.

The Budapest Open Access Initiative called for ensuring that for scholarly literature there is

… free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited

CC-BY is recommended as the optimal licence for enshrining the above definition of open access. So the question we face for software may be: which open source licence is closest in spirit to the CC-BY Creative Commons licence? At least, so I thought initially.

When looking at open source licences for our kind of project, there are really only two classes of licence to consider: a permissive MIT-like licence or a viral (copyleft) GPL-like licence.

An MIT-like licence only requires attribution when the software is re-used. A GPL-like licence requires this too, but in addition any work derived from the software also has to be made open under the same conditions: a share-alike clause.

In my mind I’ve felt that the following mapping shows the analogies:

[Figure: mapping of software licences to Creative Commons licences]

So if we want to mirror CC-BY, we should pick an MIT licence, right? Well, maybe. I certainly thought so initially, and I also thought that any other type of licence would reduce the appetite of other large publishers for looking at our system.

I tested this assumption by talking to the technical leaders of a number of large publishers, and asking them what kinds of licences were important for them when it comes to choosing open source software. The only consistent message was that they just don’t want to be locked into support contracts whose prices ratchet up each year. They were uniformly unfazed by the differences between MIT-like and GPL-like licences. One said to me that it’s really not that much of an issue. I had assumed that this target audience would strongly prefer the MIT-like licence.

I also assumed that OA advocates would have the same attitude and prefer an MIT-like licence, as it seems to map most closely to CC-BY. I had the opportunity to pose the question to Jean-Claude Guédon, one of the original signatories of the Budapest Open Access Initiative. However, he expressed a clear preference for a strong GPL licence, on the basis that this kind of licence is better suited to promoting the creation of a truly open scholarly publishing infrastructure.

So what do I think now? I think that at its heart this should really be about what effect eLife wants to have in open sourcing our platform, and that there is almost certainly no absolutely correct answer. I think that making a simple analogy, as I have done above, oversimplifies the question, and yet we still have to make a decision. As for what we want to achieve, our mission is

To help scientists accelerate discovery by operating a platform for research communication that encourages and recognises the most responsible behaviours in science.

That does not speak to what we might want to achieve by open sourcing our code, but we have a view internally that if we find effective ways of doing things, and the things that we find are picked up across the scholarly communication ecosystem, then that is a good thing in general. That means that we are happy with the idea that what we produce could be picked up and used by other publishers, or by any other group that has an interest in aspects of how research is published. Is that goal accomplished better with an MIT-like licence or a GPL-like one?

Is this even a question worth raising? Well, it’s probably not the most critical question, but I do think it is worth asking. I’ve been working on software development of tools for researchers for a long time, and it is interesting to me that the answer to this question may not be clear cut. I’m really interested in finding out what other voices in the community think.