Bob Carpenter shares this story illustrating the challenges of software maintenance. Here’s Bob:
This started with the maintenance of upgrading to the new Boost version 1.69, which is this pull request:
for this issue:
The issue happens first, then the pull request, then the fun of debugging starts.
Today’s story starts an issue from today [18 Dec 2018] reported by Daniel Lee, the relevant text of which is:
@bgoodri, it looks like the unit tests for integrate_1d is failing. It looks like the new version of Boost has different behavior then what was there before.
This is a new feature (1D integrator) and it already needs maintenance.
This issue popped up when we updated Boost 1.68 to Boost 1.69. Boost is one of only three C++ libraries we depend on, but we use it everywhere (the other two libaries are limited to matrix operations and solving ODEs). Boost has been through about 20 versions since we started the project—twice or three times/year.
Among other reasons, we have to update Boost because we have to keep in synch with CRAN package BH (Boost headers) due to CRAN maximum package size limitations. We can’t distribute our own version of Boost so as to control the terms of when these maintenance events happen, but we’d have to keep updating anyway just to keep up with Boost’s bug fixes and new features, etc.
What does this mean in practical terms? Messages like the one above pop up. I get flagged, as does everyone else following the math lib issues. Someone has to create a GitHub issue, create a GitHub branch, debug the problem on the branch, create a GitHub pull request, get that GitHub pull request to pass tests on all platforms for continuous integration, get the code reviewed, make any updates required by code review and test again, then merge. This is all after the original issue and pull request to update Boost. That was just the maintenance that revealed the bug.
This is not a five minute job.
It’ll take one person-hour minimum with all the GitHub overhead
and reviewing. And it’ll take something like a compute-day on our continuous integration servers if it passes the tests (less for failures). Deubgging may take anywhere from 10 minutes to a day or maybe two in the extreme.
My point is just that the more things we have like integrate_1d, the more of these things come up. As a result, maintenance cost is quadratic in the number of features.
It works like this:
Let’s suppose a maintenance event comes up every 2 months or
so (e.g., new version of Boost, reorg of repo, new C++ version etc.). For each maintenance event, the amount of maintenance we have to do is proportional to the number of features we have. As a result, the amount of maintenance we have to do is quadratic (e.g., a linear growth in features looks like this: 1 + 2 + 3 + … + and we do maintenance at regular intervals, so the amount of time it takes is quadratic.
This is why I’m always so reluctant to add features, especially when they have complicated dependencies.