the fun and games of testing a legacy codebase

At the local .NET Users Group meeting this week, Robert Stackhouse talked passionately of the beauty of Test Driven Development, the confidence and pleasure he derives from having end-to-end tests in place, the emotional high he gets from seeing the green bar of the passing test...
And towards the end a question came up: how do you get there?
Suppose you have a working application with not a single test. Oh, some testing has undoubtedly been done once upon a time, when the application was first developed, but no tests remain.
Now users would like some additional features. And of course they expect that currently working functionality will keep working.
What's a poor developer to do?
Traditionally, books recommended approaching this kind of project "thoughtfully and carefully". In practice, that meant giving a seriously senior (and expensive) person a long time to work on it. That person read code for a few weeks, wrote a bit of new functionality, created a bunch of documents describing how the newly added functionality would work, and then held his breath as the new version went into production. The bugs still abounded.
Now that unit tests, functional tests, and all other sorts of tests have been invented and become widely used, let's see if we can do any better.
Unit tests do provide confidence that changes in the code cause only desired changes in application behavior. However, back-fitting unit tests into a legacy application is often extremely hard, and almost always takes a while. Not to mention that it is nearly impossible to justify in terms of business value, unless it is certain that the application will continue to evolve for a long time.
Creating a small suite of functional tests that exercise the application end-to-end is usually a lot easier. It does not give you as much certainty about what the code is currently doing, but it does pin down the application's behaviors.
More importantly, this kind of test suite is useful from the start, from the very first test. Unit tests provide relatively little value while there are few of them; their value grows fast once coverage passes 50%-60%.
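One way to pin down behavior end-to-end is a "golden master" test: run the application on a fixed input, record the output once, and compare against that recording on every later run. The sketch below is only an illustration of that idea, not a prescription. `legacy_invoice_report`, its bulk discount, and `check_against_golden` are hypothetical names standing in for whatever entry point the real application exposes.

```python
import tempfile
import unittest
from pathlib import Path


def legacy_invoice_report(orders):
    """Hypothetical stand-in for the untested legacy application's entry point."""
    lines = []
    total = 0.0
    for name, qty, price in orders:
        amount = qty * price
        if qty >= 10:  # assumed undocumented rule, discovered only by running the code
            amount *= 0.9
        total += amount
        lines.append(f"{name}: {amount:.2f}")
    lines.append(f"TOTAL: {total:.2f}")
    return "\n".join(lines)


def check_against_golden(name, actual, golden_dir):
    """Record the output on the first run; compare against the recording afterwards."""
    golden_file = Path(golden_dir) / f"{name}.txt"
    if not golden_file.exists():
        golden_file.write_text(actual)
        return True
    return golden_file.read_text() == actual


class InvoiceReportEndToEndTest(unittest.TestCase):
    # A temporary directory keeps the sketch self-contained; a real suite
    # would keep the recorded files under version control instead.
    GOLDEN_DIR = tempfile.mkdtemp()

    def test_report_matches_recorded_behavior(self):
        orders = [("widget", 2, 3.50), ("gadget", 10, 1.00)]
        report = legacy_invoice_report(orders)
        self.assertTrue(check_against_golden("basic_orders", report, self.GOLDEN_DIR))
```

Saved as a module, this runs with `python -m unittest`. Note that the test does not claim the recorded output is right, only that it has not changed, which is exactly the guarantee needed before touching legacy code.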
Having prepared an end-to-end test suite, the developer can make changes to the application with a little more confidence. The next step is to create unit tests in the area of the code that will change to accommodate the new features. A few things are accomplished in this process. First, by writing unit tests the developer becomes intimately familiar with the code. Most likely, the developer will also want to refactor some of the code along the way, thus increasing code quality. The end-to-end test suite helps establish that no unintended changes in application behavior (that is, bugs) are introduced during this stage. Second, having unit tests for the relevant code allows the developer to make changes with confidence. Because unit tests are introduced only to a small area of the code, it's a much more feasible project, and the test coverage in that area can get high enough to realize the many benefits of unit testing.
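As a rough sketch of what such targeted unit tests might look like, suppose the new feature touches pricing, and the relevant logic has been pulled out into a small function. `line_amount` and its discount rule are invented for this illustration; the point is that the tests record what the code does today, boundary cases included, before anything is changed.

```python
import unittest


def line_amount(qty, price):
    """Hypothetical function extracted from the legacy code a new feature must touch."""
    amount = qty * price
    if qty >= 10:  # assumed existing behavior: 10% off for ten or more units
        amount *= 0.9
    return round(amount, 2)


class LineAmountTest(unittest.TestCase):
    def test_small_order_has_no_discount(self):
        self.assertEqual(line_amount(9, 1.00), 9.00)

    def test_discount_starts_at_exactly_ten_units(self):
        self.assertEqual(line_amount(10, 1.00), 9.00)

    def test_zero_quantity_costs_nothing(self):
        self.assertEqual(line_amount(0, 5.00), 0.0)
```

Writing the boundary test is where the familiarity with the code comes from: deciding whether the discount starts at ten units or eleven forces the developer to read, and run, the code that is about to change.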
Following this process will eventually lead to a nicely unit-tested code base and a no-longer-legacy application, as long as new feature requests keep coming in. At the same time, no large up-front investment is required for a good outcome.
- Jane Prusakova's blog


Comments
Assuming that any project, especially a particularly old one, began its life under test is assuming quite a lot. That said, you should never discard tests unless they become invalid during the course of development. If you are worried about tests increasing the size of your deliverables (and thereby load time), you can use scripts to leave your tests behind when deploying (even in compiled languages such as C#; I do this myself).
I would say that if creating unit tests for legacy code is extremely difficult, then the code is probably poorly written (which is usually the case with legacy code). You could be looking at a whole host of problems: pure spaghetti code, bad naming conventions, tight coupling, etc.
If you are adding new features, at this point, you should make a decision as to whether or not you want to add the features to the existing codebase, or if you want to isolate the new functionality from the legacy code (this is not that hard to do, especially given the case of web applications).
If you are writing new code because a long dormant bug has emerged in your app and you are a maintenance engineer, you might have no choice as to whether to isolate new code from the old (I would still look for a way). In that case, the author's suggestion of creating a bunch of functional tests that exercise your app end to end (regression tests) is not a bad place to start. I also agree with the author's suggestion to write unit tests around a component you are going to be modifying. If you are going to be aggressively refactoring legacy code, I would vehemently encourage you to commit to your source code repository after each successful small change, once all your tests pass. This will make it easy to back-pedal if you run into unforeseen issues.
So to sum up, isolate new from old if you can, and stay away from large commits when refactoring legacy code.