When evolving software products, we usually have to deal with legacy code: inherited code that isn’t covered by tests and that may even be difficult to understand. Even the most disciplined teams make mistakes from time to time. In this situation, our first impulse might be to rewrite the legacy part from scratch, but that is something we should avoid. A rewrite risks introducing new errors or dropping pieces of code that we do not understand but that are surely there for a reason.

So this time I wanted to share with you a tool that has always been very useful to me when dealing with legacy code: characterization tests.

What’s a characterization test

Automated tests are a very important tool, but not primarily for locating bugs (at least not directly). In general, these tests either specify a goal that we would like to achieve or protect a behavior that we want to preserve. In the natural flow of development, the tests that specify a behavior become the tests that preserve it.

In the case of legacy code, we may not have any tests that support us when making changes, so we have no way of verifying that, by touching something, we are preserving the behaviors we should.

Therefore, the best approach is to first build a safety net of tests around the piece of code we are going to modify, tests that capture its current behavior. We call these characterization tests because they characterize the actual behavior of the code, not the ideal behavior that would be expected from it.


How to write characterization tests

Writing characterization tests is not especially complicated (beyond the added difficulty of testing legacy code). One possible procedure for writing this type of test is:

  • Take the piece of code you are going to modify and use it in a test environment
  • Write an assertion that fails
  • Let the failure itself tell you what the actual behavior of the code is
  • Change the test so that the assertion corresponds to the current behavior of the code
  • Repeat until you get a suite of tests that makes you feel more comfortable
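As a minimal sketch of that loop, here is how it might look in Python with the standard unittest module. The function legacy_shipping_cost and its pricing rules are hypothetical, invented only for illustration; in practice you would import the real inherited code instead. The comments describe the steps that led to each pinned value.

```python
import unittest


def legacy_shipping_cost(weight_kg, express=False):
    """Hypothetical stand-in for an inherited function we do not fully understand."""
    cost = 5 + weight_kg * 2
    if weight_kg > 10:
        cost = cost - 3  # undocumented rule: we only pin it, we do not judge it yet
    if express:
        cost = cost * 1.5
    return cost


class ShippingCostCharacterization(unittest.TestCase):
    def test_light_parcel(self):
        # First version: assertEqual(legacy_shipping_cost(2), 0), written to fail.
        # The failure message reported the actual value, 9, so we pinned it.
        self.assertEqual(legacy_shipping_cost(2), 9)

    def test_heavy_parcel(self):
        # Same loop again: fail on purpose, read the real value, pin it.
        self.assertEqual(legacy_shipping_cost(12), 26)

    def test_express_parcel(self):
        self.assertEqual(legacy_shipping_cost(2, express=True), 13.5)
```

Saved in a test module, this can be run with python -m unittest. The expected values were not derived from any specification; they are simply what the code returned, which is the whole point of the technique.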

What we are trying to do is put in place a mechanism that helps us find bugs later, that is, changes that make the system behave differently from how it behaves today. These tests are not intended to document expected behaviors, but rather how the system is actually working.

If we find something unexpected or strange when writing this type of test, it is important to take note of it. It could be a bug. We can mark that test as suspicious, work out whether the behavior is expected or not, and fix it if it isn’t. But now we will be doing so with the coverage and safety of automated tests.
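One possible way to mark a test as suspicious, sketched in Python under some assumptions: the suspicious decorator and the SUSPICIOUS list are hypothetical helpers written for this example (they are not part of unittest), and legacy_discount is an invented legacy function. The test still pins the current behavior, but the doubt is recorded next to it.

```python
import unittest

# Hypothetical registry of tests that pin behavior we suspect is a bug.
SUSPICIOUS = []


def suspicious(reason):
    """Hypothetical marker: records the test name and why it looks wrong."""
    def decorator(test_func):
        SUSPICIOUS.append((test_func.__name__, reason))
        return test_func  # the test itself is left unchanged
    return decorator


def legacy_discount(total):
    """Hypothetical stand-in for inherited code."""
    if total > 100:
        return total - 10
    if total == 100:
        return total + 10  # looks wrong, but this is what the code does today
    return total


class DiscountCharacterization(unittest.TestCase):
    def test_total_above_100(self):
        self.assertEqual(legacy_discount(150), 140)

    @suspicious("a total of exactly 100 is increased, not discounted; ask the domain experts")
    def test_total_of_exactly_100(self):
        # Pinned as observed, even though it is probably a bug.
        self.assertEqual(legacy_discount(100), 110)
```

Printing SUSPICIOUS after a run gives a short list of behaviors to investigate. A plain comment on the test would work too; the decorator only makes the list easier to collect.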

Another important point to keep in mind is that we are not writing black box tests. We can look at the code to try to understand how it behaves and how to characterize it through tests.

Tips when using characterization tests

Here are some tips that can help you decide what to test in legacy code and how to test it:

  • Look for tangled pieces of code. If you don’t understand what a piece of code does, consider introducing sensing variables to help you characterize it. Use them to check that certain parts of the code are executed or to see what results they produce.
  • As you discover the responsibilities of a class or method, look for ways to create tests that force situations in which they might fail.
  • Think of the extreme and outlier values of the inputs.
  • Conditions that should hold throughout the life cycle of a class are called invariants. If you find any, try writing tests to verify them.
  • Write tests for the fragments where you are going to make the changes and write as many cases as necessary until you feel that you understand the behavior of the code.
  • After the previous step, take a look at the specific changes you are going to make and write tests for them.
  • If you are trying to extract or move some functionality, write tests to verify the existence and connection of those behaviors.
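Three of these tips (sensing variables, extreme input values, and invariants) can be sketched together in Python. Everything here is an assumption made for illustration: LegacyInvoice is an invented class, clamp_branch_ran is a sensing variable added only so the tests can see which branch ran, and "the total is never negative" is an invariant we believe we discovered while reading the code.

```python
import unittest


class LegacyInvoice:
    """Hypothetical stand-in for a tangled inherited class."""

    def __init__(self):
        self.lines = []
        self.clamp_branch_ran = False  # sensing variable, added only for testing

    def add_line(self, amount):
        self.lines.append(amount)

    def total(self):
        result = 0
        for amount in self.lines:
            result += amount
        if result < 0:
            self.clamp_branch_ran = True  # lets a test sense that this path executed
            result = 0
        return result


class InvoiceCharacterization(unittest.TestCase):
    def test_empty_invoice(self):
        # Extreme input: no lines at all.
        invoice = LegacyInvoice()
        self.assertEqual(invoice.total(), 0)
        self.assertFalse(invoice.clamp_branch_ran)

    def test_refund_larger_than_charges(self):
        invoice = LegacyInvoice()
        invoice.add_line(10)
        invoice.add_line(-25)
        self.assertEqual(invoice.total(), 0)
        self.assertTrue(invoice.clamp_branch_ran)  # the tangled branch really ran

    def test_invariant_total_is_never_negative(self):
        # Invariant checked over a handful of extreme and outlier inputs.
        for amounts in ([], [0], [-1], [5, -5], [10**9, -(10**9) - 1]):
            with self.subTest(amounts=amounts):
                invoice = LegacyInvoice()
                for amount in amounts:
                    invoice.add_line(amount)
                self.assertGreaterEqual(invoice.total(), 0)
```

The sensing variable is meant to be temporary: once the code is understood and covered, it can be removed together with the assertions that read it.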

Some final words

Dealing with legacy code is always tricky. And, of course, tests of this type only provide a safety net; they are not infallible. Still, characterization tests can be of great help to us and keep us from making changes blindly. Let’s also remember that the goal when dealing with legacy code should be to make it a little better, more manageable, and better protected, not to redo it all from scratch.

I have not written a complete example for this case but, if you think it might be interesting, please ask in the comments and I will be happy to prepare one. My main goal was to introduce the idea of testing what exists, that is, the current behavior rather than the expected one. And no matter how badly we set up those tests, they are always better than nothing. In my case, it took a lot of effort to put them into practice, but once you become fluent you’ll see how much they help and the good results they bring.

Lastly, I want to credit Michael Feathers and his book Working Effectively with Legacy Code for all of this knowledge. I’ve learned a lot from this book and I encourage you to take a look at it.

And, as always, I’ll be pleased to listen to your opinions and to read about your own experiences and tips!