Keeping code duplication to a minimum is one of the Four Principles of Simple Design. In this article, we look at why code duplication is a problem and which practices help reduce it.
You will learn
- What Is Code Duplication?
- Why Is Code Duplication Bad?
- How do you measure Code Duplication?
- How can you ensure that Code Duplication standards are adhered to?
This is the fourth article in a series of six articles on important code quality terminology:
- 1 - What Is Technical Debt?
- 2 - What Is Readability Of Code?
- 3 - What Is Code Complexity?
- 4 - What Is Code Duplication?
- 5 - What Is Code Coverage?
- 6 - What Is Legacy Code?
What Is Code Duplication?
The most basic form of code duplication is the same block of code repeated in multiple places.
Why Is Code Duplication Bad?
Suppose a change is needed in one of the places where a code block is used. Once the change is made, it has to be repeated everywhere else the block occurs. If a 40-line code block occurs in 10 places in your application, there is a real chance that a developer modifies it in only 8 of the 10. The code is now potentially broken!
The more code duplication you have, the harder the code is to maintain.
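The missed-copy scenario above can be sketched in a few lines. This is a minimal, hypothetical illustration: the class names, the discount rule, and the amounts are all assumptions made up for the example, not taken from any real project.

```java
// Hypothetical example: the same discount rule was copied into two classes.
// A later change raised the threshold from 10000 to 15000 cents,
// but only one of the two copies was updated.

class CheckoutPage {
    // Updated copy: discount applies above 15000 cents.
    static int totalInCents(int amountInCents) {
        int discount = amountInCents > 15_000 ? amountInCents / 10 : 0;
        return amountInCents - discount;
    }
}

class InvoicePrinter {
    // Missed copy: still applies the old threshold of 10000 cents.
    static int totalInCents(int amountInCents) {
        int discount = amountInCents > 10_000 ? amountInCents / 10 : 0;
        return amountInCents - discount;
    }
}

class MissedCopySketch {
    public static void main(String[] args) {
        int amount = 12_000;
        // The two totals disagree for the same order amount.
        System.out.println("Checkout total: " + CheckoutPage.totalInCents(amount));
        System.out.println("Invoice total:  " + InvoicePrinter.totalInCents(amount));
    }
}
```

For an order of 12000 cents, the checkout page charges the full amount while the invoice still shows the discounted one. Nothing fails to compile, which is why this kind of inconsistency is easy to miss.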
How To Measure Code Duplication?
Static analysis tools such as SonarQube measure code duplication as part of estimating technical debt:
In the SonarQube project overview, code duplication is reported under “Duplication” at the bottom. It gives an idea of the percentage of duplicate code in the project.
Typically, some amount of duplication is present in any code base. A commonly used limit for controlled duplication is 5%: a project with less than 5% duplicated code is considered very good.
It is important to continuously evaluate code duplication and identify improvements.
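The percentage and the 5% limit can be made concrete with a small calculation. The sketch below assumes the percentage is simply duplicated lines divided by total lines, times 100; the class and method names are hypothetical, and the 10,000-line project size is an invented figure. The 400 duplicated lines correspond to a 40-line block occurring in 10 places, assuming every copy is counted.

```java
// Hypothetical helper: a simplified view of a duplication percentage,
// not the exact computation of any particular tool.
class DuplicationCheck {
    // The 5% limit discussed in the text.
    static final double LIMIT_PERCENT = 5.0;

    static double duplicationPercent(int duplicatedLines, int totalLines) {
        if (totalLines <= 0) {
            throw new IllegalArgumentException("totalLines must be positive");
        }
        return 100.0 * duplicatedLines / totalLines;
    }

    static boolean withinLimit(int duplicatedLines, int totalLines) {
        return duplicationPercent(duplicatedLines, totalLines) < LIMIT_PERCENT;
    }

    public static void main(String[] args) {
        int duplicatedLines = 40 * 10; // a 40-line block in 10 places
        int totalLines = 10_000;       // assumed project size
        System.out.println("Duplication %: "
                + duplicationPercent(duplicatedLines, totalLines));
        System.out.println("Within 5% limit: "
                + withinLimit(duplicatedLines, totalLines));
    }
}
```

With these assumed numbers the project sits at 4%, just under the limit. One more copy-pasted block of similar size would push it close to the threshold.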
Looking At Code With Duplication
Have a look at the following source file, taken from the project we ran SonarQube on earlier:
< ParenthesizedTreeImpl.java >
You can see that the duplicated block is highlighted by a bold-grey vertical line. We can click on it to see the duplicated blocks:
SonarQube indicates that SynchronizedStatementTreeImpl.java, SwitchStatementTreeImpl.java and ParenthesizedTreeImpl.java have the same code block duplicated across them.
Avoiding Code Duplication
Make Use Of Inheritance
The simplest solution for the problem at hand is to define a superclass and move the duplicated blocks into its methods, so that every subclass inherits a single shared copy.
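As a minimal sketch of this approach, the example below pulls a shared block up into a superclass. All names here are hypothetical and chosen only for illustration; they are not the classes from the SonarQube report, whose duplicated block is not shown in this article.

```java
// Hypothetical example: header formatting used to be copied into
// every notification class. It now lives once in the superclass.
abstract class Notification {
    private final String recipient;

    Notification(String recipient) {
        this.recipient = recipient;
    }

    // Formerly duplicated in each subclass; defined once here.
    String header() {
        return "To: " + recipient.trim().toLowerCase(java.util.Locale.ROOT);
    }

    // Each subclass supplies only what is specific to it.
    abstract String body();

    String render() {
        return header() + "\n" + body();
    }
}

class WelcomeNotification extends Notification {
    WelcomeNotification(String recipient) {
        super(recipient);
    }

    @Override
    String body() {
        return "Welcome aboard!";
    }
}

class ReminderNotification extends Notification {
    ReminderNotification(String recipient) {
        super(recipient);
    }

    @Override
    String body() {
        return "Your trial ends soon.";
    }
}

class InheritanceSketch {
    public static void main(String[] args) {
        System.out.println(new WelcomeNotification("  Ada@Example.com ").render());
        System.out.println(new ReminderNotification("bob@example.com").render());
    }
}
```

A change to the header format now happens in one place and reaches every subclass. This works best when the classes genuinely share an "is a" relationship.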
Define A Utility Class Or Method
This is the alternative when reuse through inheritance is not a good fit, for example when the classes sharing the code are otherwise unrelated.
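A utility method might look like the sketch below. The class and method names are hypothetical, and the two form classes stand in for any pair of unrelated classes that previously carried their own copy of the same check.

```java
// Hypothetical utility class: holds logic shared by unrelated classes.
final class TextUtils {
    private TextUtils() {
        // Not meant to be instantiated.
    }

    // Formerly copied into each class that validated text input.
    static boolean isNullOrBlank(String value) {
        return value == null || value.trim().isEmpty();
    }
}

// Two unrelated classes reuse the same check without sharing a superclass.
class CustomerForm {
    boolean isValid(String name) {
        return !TextUtils.isNullOrBlank(name);
    }
}

class OrderForm {
    boolean isValid(String productCode) {
        return !TextUtils.isNullOrBlank(productCode);
    }
}

class UtilitySketch {
    public static void main(String[] args) {
        System.out.println(new CustomerForm().isValid("Ada"));
        System.out.println(new OrderForm().isValid("   "));
    }
}
```

Neither form class needs to know about the other, and the blank check can be changed or tested in a single place.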
The catch is that there is no single solution! It depends on the duplicate code block, and the context in which it is being used.
In this article, we took a close look at code duplication. We explored why code duplication is harmful and how it can be detected and measured. We also had a brief look at how to avoid it.