By Peter Kraker, Christopher Kittel and Maxi Schramm
If you have been using Open Knowledge Maps over the past eight months, you were working with software that was undergoing major changes. During that period, we rewrote the technical foundation for the OKMaps frontend, but chances are that you didn’t even notice the change. In fact, errors and reported issues declined during that period, even though we saw a major uptick in usage of our platform with an average increase of around 50% per month in created knowledge maps.
Software rewrites are notoriously difficult to pull off. They are hard to plan, as there are many unknowns in the process. There is often a steep learning curve involved when re-creating your software with new technologies. In addition, there is a tension between day-to-day needs of your stakeholders and the needs of the rewrite, leading to delays and dead ends.
As a result, many of these projects are broken off, or they take much longer than anticipated. In fact, at Open Knowledge Maps, we had previously failed twice to rewrite our frontend. So how did we manage to pull it off this time? How were we able to minimize the delays? And why did we need a rewrite in the first place?
To understand all of this, we need to go back in time to the origins of Open Knowledge Maps.
Head Start, the software underlying all Open Knowledge Maps services, was developed in an agile manner. The first version that was released around eight years ago was only a prototype. In housing terms, you could say that it was a nice shed: a single-purpose room that lacked structure and space for reconfiguration.
People liked Head Start and they saw a lot of potential in it. So over the years, it received a foundation, basic plumbing, and many different extensions, which ultimately turned it into a multi-room house with a variety of new functions. A lot of users, residents if you will, moved into the house, and they stayed for longer and longer periods of time.
But there was a problem: with every new extension, the structural integrity became more difficult to manage. The different extensions were imbalanced and threatened to fall over. In the beginning, we could easily fix these issues, but as time went on, these fixes added up and became unmanageable. In addition, we had some extensions in mind that simply were not possible in the current configuration. It became clear that we needed to address these issues and put our house on a completely new foundation.
This is where the eLife Innovation Initiative came in. One of the lessons from our previous, failed attempts was that we needed outside help: a bigger budget and support from someone who had expertise in such rewrites. Unfortunately, there are not many funders out there that sponsor the maintenance efforts of existing infrastructures. So it was much to our delight when the eLife Innovation Initiative considered supporting our plans.
Early on in our discussions, the eLife team made clear to us that a complete rewrite as we had planned comes with big risks: first of all, it’s very difficult to build a new house from scratch that is equal to the old house. We were likely to run into unforeseen issues leading to delays. In addition, we still needed to extend the old house while the new house was being built. This would make it very hard to coordinate a moving date, and it was likely that we would never really be able to catch up with the newest extensions of the old house. As such, our new house might be a long time away, and even when we were finally able to move, we would miss many of the features of the old house.
This was certainly not what we wanted. After a few conversations with the eLife team, we decided to go with their suggestion and do a refactoring instead. In a refactoring, you add a new base to the house and redo the existing house room by room, while the people are still living in it. With this approach, everything needs to be planned very carefully, but in return it would be possible to avoid many of the pitfalls of a complete rewrite. Instead of one big move, it would be many small moves, which would make negotiating timelines much easier.
We had a promising plan, but nevertheless it was clear that there were some challenges ahead. After all, we had given ourselves just six months to complete the refactoring, and we needed to reconcile it with other ongoing work and planned releases.
We are happy to say that we completed the refactoring not only without major technical issues, but also with just a minimal delay. We attribute this to a number of factors. First, we had a detailed plan of the project that we regularly adjusted based on the knowledge that we gained during the course of the development.
Second, we broke down the software into several well-defined components that could each be refactored in a few weeks. We also decided to first produce a Minimum Viable Product (MVP) with a component that was complex enough to be representative of the whole refactoring, but not so big as to be overwhelming. This gave us room to identify the most effective workflow and tune our processes.
Third, we did a lot of reviews after each component refactoring, both of the code and of the whole system. At the heart of this were automated tests, but also painstaking manual evaluations by the project team: the goal was to reproduce the exact same behaviour as the original code, apart from performance improvements and obvious bugs. When problems occurred, we attempted to mitigate them as soon as possible. As such, we made sure that we did not have too many bad surprises from problems that accumulated without our knowing.
The successful refactoring of our frontend enables us to “raise the roof” of our house. We now have the technical foundation to implement the visualization features that we promised in our roadmap. This includes new modes of visualization such as a mobile version of the knowledge map. The refactoring also makes it possible to implement certain features that are in high demand from our users, such as navigating the map using the browser’s back and forward buttons or sharing the contents of a specific map region. We will also be able to implement new functionality with less effort than before.
Now, an even bigger challenge awaits: in order to realize our roadmap, we need to secure enough funding to implement the next development goals. This includes a model for custom services where institutions can include Open Knowledge Maps visualizations as cloud components. The components can be restricted to specific data sources, which, among other things, enables visual discovery over the resources of an institution or a special collection, such as a research data management system.
To fund and sustain this infrastructure, we operate a crowdsourcing model: libraries become supporting members of Open Knowledge Maps and provide an annual contribution. In return, the supporting members become part of the organisation’s governance and are directly involved in the decision-making process by way of the Board of Supporters. For more information on this model, please see our website.
Initially, we defined primary and secondary goals of the refactoring in line with our overall product roadmap. This helped us prioritize tasks later on. We did a first code review, and created specification and review templates. Internally, we agreed on meeting schedules and formats, and we reviewed our skills to identify possible needs for external help.
The following image shows the four components into which we divided the refactoring: the title and context line (green box on the top left), the knowledge map (blue box on the left), the list (purple box on the right), and additional components (cyan box on the outer left, consisting of, e.g., modals).
We then had to make a foundational decision: which framework to choose, Vue.js or React? We had some experience with both frameworks from our previous rewrite attempts, but we had never made a definitive decision.
Based on the initial code review and the project goals, both frameworks were still considered valid contenders. Since our main front-end developer had previous knowledge of React, we were leaning towards this framework. However, we were not sure if the zoom animations would work smoothly in React, because the framework adds some overhead to each interaction, and some of the tests in our previous rewrite attempts had indicated that this might be a problem. To make the final call, we built a prototype of all the main features in React. Luckily, the prototype showed a satisfactory level of performance, and we felt confident with the decision to adopt React as our new framework.
After this was settled, we made some further important architectural decisions: to use Redux to manage the internal state of the application, and to add an intermediate layer as the single point of communication between the old code and the new code. The intermediate layer is designed so that it clearly separates old from new code, so both can exist independently. It gave us the ability to switch old and new components on and off at will, so that we could swap them around without any noticeable change from a user perspective. We incurred a temporary increase in complexity of the code base, but we gained a high level of control because everything moved through this layer, which is invaluable in an application with a large number of interactions and possible states.
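To make the idea of the intermediate layer more concrete, here is a minimal TypeScript sketch. It is not taken from the Head Start code base: the component names, the `IntermediateLayer` class, the `LegacyHandlers` interface and the zoom actions are all assumptions made for illustration, and the tiny `createStore` function is only a stand-in for a real Redux store so that the example runs without dependencies. The point it tries to show is that every interaction enters through one place, which updates the new state store and forwards the event only to those legacy components that have not been switched over yet.

```typescript
// Hypothetical sketch: none of these names come from the actual code base.

type Action =
  | { type: "ZOOM_IN"; areaId: string }
  | { type: "ZOOM_OUT" };

interface State {
  zoomedArea: string | null;
}

// Redux-style reducer: a pure function from (state, action) to the next state.
function reducer(state: State, action: Action): State {
  switch (action.type) {
    case "ZOOM_IN":
      return { zoomedArea: action.areaId };
    case "ZOOM_OUT":
      return { zoomedArea: null };
    default:
      return state;
  }
}

// Tiny stand-in for a Redux store, so the sketch has no dependencies.
interface Store {
  getState(): State;
  dispatch(action: Action): void;
}

function createStore(initial: State): Store {
  let state = initial;
  return {
    getState: () => state,
    dispatch: (action: Action) => {
      state = reducer(state, action);
    },
  };
}

type ComponentName = "title" | "map" | "list";

// Legacy components only see these callbacks, never the store itself.
interface LegacyHandlers {
  onZoomIn(areaId: string): void;
  onZoomOut(): void;
}

class IntermediateLayer {
  private store: Store;
  private legacy: Record<ComponentName, LegacyHandlers>;
  // Components in this set have been refactored and read from the store.
  private refactored = new Set<ComponentName>();

  constructor(store: Store, legacy: Record<ComponentName, LegacyHandlers>) {
    this.store = store;
    this.legacy = legacy;
  }

  // Switch the refactored version of a component on or off.
  enable(name: ComponentName): void {
    this.refactored.add(name);
  }

  disable(name: ComponentName): void {
    this.refactored.delete(name);
  }

  // Single entry point for a zoom-in interaction.
  zoomIn(areaId: string): void {
    this.store.dispatch({ type: "ZOOM_IN", areaId });
    this.forEachLegacy((handlers) => handlers.onZoomIn(areaId));
  }

  // Single entry point for a zoom-out interaction.
  zoomOut(): void {
    this.store.dispatch({ type: "ZOOM_OUT" });
    this.forEachLegacy((handlers) => handlers.onZoomOut());
  }

  // Forward an event to every component still served by the old code.
  private forEachLegacy(call: (handlers: LegacyHandlers) => void): void {
    for (const name of Object.keys(this.legacy) as ComponentName[]) {
      if (!this.refactored.has(name)) {
        call(this.legacy[name]);
      }
    }
  }
}
```

In a sketch like this, calling `enable("title")` stops the old title code from receiving events while the new title component reads `zoomedArea` from the store, and `disable("title")` reverts to the old behaviour. The duplication between store and callbacks is the temporary complexity mentioned above; it disappears once every component is switched over.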
The next step was to produce the first component (the MVP component, as described above), for which we chose the title and context line. To do so, we put into practice a refactoring cycle we devised earlier:
- Create a specifications document
- Do an in-depth code review of the component
- Devise an implementation plan
- Write code
- Conduct tests and reviews
- Improve documentation
In the specifications document, we defined the scope of the component and a concrete reviewable outcome of the refactoring cycle. This was extended by a code review, where we identified risks and dependencies, and documented all the functions, variables and data that we needed to move.
Based on this, we devised the implementation plan, where we made sure to work on high-risk and uncertain items first. Attacking high-risk items early on, coupled with regular risk monitoring, helped us keep the project together. These items included, for example, how to migrate third-party dependencies to the new framework, how to interweave the old mediator pattern with the new Redux state store without conflicts, and whether the tight development schedule would work for a team spread over multiple projects.
We then put the implementation plan into practice in several development cycles (sprints) of one or two weeks each, depending on the task. The implementation was complemented by extensive tests and reviews. We created automated tests for each component, but also manually tested the implementation after each review. We defined test cases that covered both generic and more specialised searches, and we reviewed the whole system from an end-user’s perspective. To do this, we prepared two instances: one with the old code base and one including the newly refactored component. We then painstakingly compared the old system to the new system to see if it delivered the exact same behaviour (apart from performance improvements and obvious bugs).
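Part of such an old-versus-new comparison can be automated. The following TypeScript sketch shows one way this might look; it is an illustration under assumptions, not our actual test setup. The `compareBehaviour` helper, the `Search` type, the two title functions and the example queries are all invented for this example. The helper runs the old and the new implementation on the same inputs and collects every case where the results differ.

```typescript
// Hypothetical sketch: function and field names are illustrative only.

interface Mismatch<I, O> {
  input: I;
  expected: O; // what the old code produced
  actual: O; // what the refactored code produced
}

// Run both implementations on the same inputs and collect every difference.
function compareBehaviour<I, O>(
  oldImpl: (input: I) => O,
  newImpl: (input: I) => O,
  cases: I[],
): Mismatch<I, O>[] {
  const mismatches: Mismatch<I, O>[] = [];
  for (const input of cases) {
    const expected = oldImpl(input);
    const actual = newImpl(input);
    // Comparing JSON strings is a simplification: it is sensitive to key
    // order and ignores values that JSON cannot represent.
    if (JSON.stringify(expected) !== JSON.stringify(actual)) {
      mismatches.push({ input, expected, actual });
    }
  }
  return mismatches;
}

// Made-up example: a title line produced by an "old" and a "new" function.
interface Search {
  query: string;
  documents: number;
}

const oldTitle = (s: Search): string =>
  "Overview of " + s.query + " based on " + s.documents + " documents";

const newTitle = (s: Search): string =>
  `Overview of ${s.query} based on ${s.documents} documents`;

// One generic and one more specialised search, both invented.
const cases: Search[] = [
  { query: "digital education", documents: 100 },
  { query: '"climate change" AND policy', documents: 42 },
];

const mismatches = compareBehaviour(oldTitle, newTitle, cases);
console.log(mismatches.length === 0 ? "identical behaviour" : mismatches);
```

A check like this only covers what can be expressed as comparable output. Animations, layout and the overall feel of the interface still need the manual side-by-side review described above.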
After the first component was implemented, we completed the refactoring cycle another three times for the major components (knowledge map, list and additional components). Finally, we devoted a couple of sprints to code cleaning and documentation, and also held a technical debrief meeting, in which we reviewed the final state of the code base. For the final documentation, we focused on high-level architecture diagrams. These provide an understanding of the software that cannot be readily found in the code. The advantage is that this type of documentation does not get outdated as quickly as more fine-grained documentation.
The completion of the refactoring yielded a more modular application structure with increased reusability and adaptability of components. It gives us better control over application states, which makes it easier to add new interactions. The refactoring simplified dependencies between components and enabled us to update the documentation. We were also able to improve the performance of certain components and fix a number of bugs along the way.
All in all, the project was a huge success. It also opened our eyes as to how to improve our code base going forward. In the future, we will make these improvements as part of our daily work, so as to avoid the need for another such large-scale refactoring effort in the frontend anytime soon.
We welcome comments, questions and feedback. Please annotate publicly on the article or contact us at innovation [at] elifesciences [dot] org.