Legacy Coderetreat: Part 11 – Refactoring Rule of Three

Blog post series

This blog post is part of a series about legacy coderetreat and legacy code techniques you can apply during your work. The other posts in the series cover more sessions about legacy code.

Purpose

The rule of three says:

Extract duplication only when you see it the third time.

This concept is extremely useful when you want to improve legacy code, refactor in the TDD cycle or just improve existing code that is covered by tests.

The Rule of Three brings higher coherence and more clarity to the system, because the duplicated code starts to be moved into specialized areas. In this way we optimize the code base for changeability.

The Rule of Three prevents us from extracting possible duplication prematurely: it defers minimizing duplication until we have enough proof that the duplication is real.

Concept

We want to understand legacy code better. So we apply the concept of Divide et Impera: we create more, smaller functions, classes, packages, modules, etc.

Even in very ugly code we can see some structures that can be extracted. To find them, we look for duplication. Duplication can appear in magic numbers, magic strings, constants, variables, code blocks, methods, classes, modules, etc.

The Rule of Three says that it is a good idea to extract these elements, in order to minimize the duplication, only when you spot them three times.

Step 1: Identify partial duplication.

We have some code that we want to refactor. The first thing we do is try to spot partial duplication in it.

Suppose the code contains four usages of System.out.println, each with a different parameter. That is partial duplication: the call is the same, but the argument differs.
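The original listing is not available here, so the following is only a minimal sketch of what such code might look like. The class name RollReportStep1, the method report and its parameters are hypothetical and chosen for illustration.

```java
// Hypothetical example; the names are illustrative and not taken from the original code base.
class RollReportStep1 {

    static void report(String player, int roll, int place, String category) {
        // Four calls to System.out.println, each with a different argument:
        // the statements are similar, but not yet identical.
        System.out.println(player + " is the current player");
        System.out.println("They have rolled a " + roll);
        System.out.println(player + "'s new location is " + place);
        System.out.println("The category is " + category);
    }
}
```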

Step 2: Make the duplication clear

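In this step we transform the code so that the statements become identical. One way to do that is to move each differing argument into a variable called message, so that every call reads System.out.println(message).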
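As a sketch of this step, assuming the same hypothetical report method as before (the class name RollReportStep2 is also made up), each argument is first assigned to a message variable. The behavior does not change; only the shape of the code does.

```java
// Hypothetical example; the names are illustrative and not taken from the original code base.
class RollReportStep2 {

    static void report(String player, int roll, int place, String category) {
        // Each differing argument is assigned to `message` first,
        // so the four println statements become textually identical.
        String message = player + " is the current player";
        System.out.println(message);

        message = "They have rolled a " + roll;
        System.out.println(message);

        message = player + "'s new location is " + place;
        System.out.println(message);

        message = "The category is " + category;
        System.out.println(message);
    }
}
```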

Step 3: Extract identical duplication only when it’s repeated three times

At this moment we have the statement System.out.println(message) four times in the code. So the next step is to extract that to a method.

If we had seen that duplication only twice, we would have needed to find another place where we could make the duplication clear. The Rule of Three says that we should not extract anything if the duplication occurs only twice.
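A possible result of the extraction is sketched below. The method name display is an assumption made for this example; an IDE's Extract Method refactoring would ask for a name and replace all identical occurrences.

```java
// Hypothetical example; the names are illustrative and not taken from the original code base.
class RollReportStep3 {

    static void report(String player, int roll, int place, String category) {
        String message = player + " is the current player";
        display(message);

        message = "They have rolled a " + roll;
        display(message);

        message = player + "'s new location is " + place;
        display(message);

        message = "The category is " + category;
        display(message);
    }

    // The statement that appeared four times now lives in exactly one place.
    private static void display(String message) {
        System.out.println(message);
    }
}
```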

Step 4: Refactor as required

At this moment the message variables are no longer needed, so we can inline them: each expression is passed directly to the extracted method.
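After inlining, the code might look like the sketch below. It assumes the same hypothetical report method and an extracted method named display, neither of which comes from the original listing.

```java
// Hypothetical example; the names are illustrative and not taken from the original code base.
class RollReportStep4 {

    static void report(String player, int roll, int place, String category) {
        // The temporary `message` variable is inlined into each call.
        display(player + " is the current player");
        display("They have rolled a " + roll);
        display(player + "'s new location is " + place);
        display("The category is " + category);
    }

    private static void display(String message) {
        System.out.println(message);
    }
}
```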

When we use the Rule of Three, the refactoring becomes mechanical. We could have extracted methods immediately, but IDEs often extract the whole statement, argument included. That would have scattered the duplication and made it less clear instead of minimizing it. If we extract specialized code into its own methods or classes, the duplication ends up spread across those methods or classes, and that kind of duplication is harder to see.
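To illustrate the risk, here is a sketch of what extracting each whole statement too early might produce. All names are hypothetical. The output is unchanged, but the call to System.out.println is now repeated in four separate methods, where the duplication is easier to overlook.

```java
// Hypothetical example of a premature extraction; the names are illustrative only.
class RollReportExtractedTooEarly {

    static void report(String player, int roll, int place, String category) {
        printCurrentPlayer(player);
        printRoll(roll);
        printLocation(player, place);
        printCategory(category);
    }

    // Each method still calls System.out.println directly:
    // the duplication was moved around, not removed.
    private static void printCurrentPlayer(String player) {
        System.out.println(player + " is the current player");
    }

    private static void printRoll(int roll) {
        System.out.println("They have rolled a " + roll);
    }

    private static void printLocation(String player, int place) {
        System.out.println(player + "'s new location is " + place);
    }

    private static void printCategory(String category) {
        System.out.println("The category is " + category);
    }
}
```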

This is one very important case where the Rule of Three helps us minimize refactoring mistakes. In this way we make sure that we really minimize the duplication and do not just mask it.

Similar cases appear when extracting constants, variables, classes, etc. The next post will go into detail on how to extract a class from legacy code.

Outcomes

If we use this rule of thumb, we get a mechanical approach to refactoring. It helps a lot with improving the internal structure of the code, because we do not need to wonder too much about whether it is time to refactor.

In my experience this way of working minimizes duplication faster than just trying to spot duplication from place to place. The reason, in my view, is that the technique is more mechanical, so we need to focus less on when to minimize duplication. That leaves more brain power for finding out how to minimize it.

Remarks

Even though the name of this concept is “The Rule of Three”, consider it a guideline. Sometimes we need to minimize the duplication only when it appears four or five times. At the same time, though not very often, it can be a good idea to extract the duplication when it appears only twice.

In order to use the Rule of Three, we first need to make the duplication visible. So if we want to extract duplication, we first make it clear, and only after that extract it. Otherwise there is a clear danger of extracting code that is not completely duplicated and introducing a defect in this way.
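A small sketch of that danger, using made-up names: the three statements below look alike, but the third one calls print instead of println. Replacing all three with one extracted method that calls println would silently add a line break after "Answer: ". Making the duplication explicit first exposes the difference before anything is extracted.

```java
// Hypothetical example; the names are illustrative and not taken from the original code base.
class AlmostDuplicateSketch {

    static void report(String player, int roll) {
        System.out.println(player + " is the current player");
        System.out.println("They have rolled a " + roll);
        // print, not println: no line break is written after the prompt,
        // so this statement is NOT a duplicate of the two above.
        System.out.print("Answer: ");
    }
}
```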

The Rule of Three is used extensively when doing Test Driven Development (TDD) as well. In that case we deliberately generate duplication by adding tests, in order to remove it afterwards. The duplication that appears becomes proof that the design is heading in the right direction.

The usage of an IDE that has powerful refactoring tools is essential if we want to have fast and mechanical refactorings. The Rule of Three is one case where the IDE can help a lot.

Remember: refactor only when your tests cover the code and they are green.

The Rule of Three is closely connected with The Four Elements of Simple Design. It can be used as a mechanical process for applying the second rule, “Minimize Duplication”.

History

The Rule of Three was first formulated by Don Roberts and was introduced to a wider audience by Martin Fowler in the book Refactoring.

Code Cast

Please find here a code cast in Java about this session

Acknowledgements

Many thanks to Thomas Sundberg for proofreading this blog post.

 

Image credit: http://fc09.deviantart.net/fs71/f/2012/263/5/0/this_is_how_i_looked_yesterday_for_six_whole_hours_by_marygriffith1743-d5fbqei.jpg
