Friday, October 7, 2011

Working Effectively with Legacy Code

A lucky developer is one who works on a new, green-field project. Unfortunately, maintaining and enhancing existing projects is the norm, and in my new job I'm maintaining a couple of these "legacy code" projects. So now seemed like the right time to get to a book that has been on my list for a while: Working Effectively with Legacy Code by Michael Feathers.

Legacy Code has many definitions:
Code inherited from someone else
Code with a tangled, unintelligible structure
Code that's difficult to change
"Rotten" code

The author's definition is simple and unusual: code without tests.
Code without tests is bad code. It doesn't matter how well written it is; it doesn't matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don't know if our code is getting better or worse.
To this I would add that legacy code is code that was written without any thought to how it would be tested.

In the foreword, Robert Martin provides a nice summary of the book:
It's about taking a tangled, opaque, convoluted system and slowly, gradually, piece by piece, step by step, turning it into a simple, nicely structured, well-designed system. It's about reversing entropy.

Given the author's focus on tests, though, the book is really about how to change code so that it can be tested. It is thought-provoking and provides techniques that can be used to "understand code, get it under test, refactor it, and add features."

A few points that resonated with me:
Preserving existing behavior is one of the largest challenges in software development. Even when we are changing primary features, we often have very large areas of behavior that we have to preserve.
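
One way to pin down existing behavior before changing it is what Feathers calls a characterization test: a test that records what the code does today rather than what it ought to do. Here is a minimal Python sketch of the idea. The function `legacy_price` and its discount rule are hypothetical stand-ins for some untested legacy routine, not something taken from the book.

```python
import unittest


def legacy_price(quantity, unit_cost):
    """Hypothetical legacy function whose rules nobody fully remembers."""
    total = quantity * unit_cost
    if quantity > 10:
        total = total - total // 10  # undocumented bulk discount
    return total


class LegacyPriceCharacterization(unittest.TestCase):
    """Record what the code does today, not what we think it should do."""

    def test_small_order_has_no_discount(self):
        self.assertEqual(legacy_price(2, 50), 100)

    def test_bulk_order_gets_current_discount(self):
        # The expected value was observed by running the existing code.
        self.assertEqual(legacy_price(20, 50), 900)
```

With tests like these in place, a later refactoring that accidentally changes the discount shows up as a failing test instead of a production surprise.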

Dependency is one of the most critical problems in software development. Much legacy code work involves breaking dependencies so that change can be easier.
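
To make the idea of breaking a dependency concrete, here is a small Python sketch in the spirit of passing a collaborator in through the constructor instead of creating it inside the class. All of the names (`SmtpMailer`, `InvoiceNotifier`, `FakeMailer`) are hypothetical and chosen only for illustration.

```python
class SmtpMailer:
    """Hypothetical production dependency: would talk to a real mail server."""

    def send(self, to, body):
        raise RuntimeError("no mail server available in tests")


class InvoiceNotifier:
    # The mailer is passed in rather than created inside the class,
    # so a test can substitute a fake without touching a real server.
    def __init__(self, mailer=None):
        self.mailer = mailer if mailer is not None else SmtpMailer()

    def notify(self, customer_email, amount):
        self.mailer.send(customer_email, f"Amount due: {amount}")


class FakeMailer:
    """Test double that records messages instead of sending them."""

    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))


fake = FakeMailer()
InvoiceNotifier(fake).notify("a@example.com", 120)
print(fake.sent)
```

Existing callers can keep calling `InvoiceNotifier()` with no arguments and get the real mailer, while tests hand in the fake and inspect what was recorded.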

What are the problems with big classes? The first is confusion. When you have 50 or 60 methods on a class, it's often hard to get a sense of what you have to change and whether it is going to affect anything else. In the worst cases, big classes have an incredible number of instance variables, and it is hard to know what the effects are of changing a variable. Another problem is task scheduling. When a class has 20 or so responsibilities, chances are, you'll have an incredible number of reasons to change it. In the same iteration, you might have several programmers who have to do different things to the class. If they are working concurrently, this can lead to some serious thrashing...
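
A common response to an oversized class is to move one responsibility into its own class and have the original delegate to it. The sketch below is a deliberately tiny Python illustration of that move, with hypothetical names (`Account`, `ReportFormatter`). A real legacy class would be far larger, and the split would normally be done under the cover of tests.

```python
class ReportFormatter:
    """Hypothetical class extracted from a larger one: formatting only."""

    def format(self, name, total):
        return f"{name}: {total:.2f}"


class Account:
    """What remains keeps the balance logic and delegates formatting."""

    def __init__(self, name, formatter=None):
        self.name = name
        self.entries = []
        self.formatter = formatter if formatter is not None else ReportFormatter()

    def add(self, amount):
        self.entries.append(amount)

    def total(self):
        return sum(self.entries)

    def report(self):
        # Formatting changes now touch ReportFormatter, not Account.
        return self.formatter.format(self.name, self.total())
```

After the split, two programmers working on report layout and on balance rules in the same iteration are editing different classes, which is the thrashing the quote describes avoiding.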

The later chapters are a bit hard to digest in isolation. I've found it best to keep the book handy as a reference when I encounter some legacy code that presents a specific problem.
