Today is my last working day at AndroidPIT. After building and developing AndroidPIT for ten years, I decided a few months ago to start my own business as a freelance programmer to gain new experience.
I programmed alone for a few months back in 2009; shortly after that, we started hiring additional programmers. Working in a team was different because developers with different programming styles and levels of experience were suddenly working on the same code. In code reviews, we spent much of our time unifying the code style and fixing similar types of bugs.
Reason enough to check whether this work can somehow be automated.
What are the challenges in detail?
Maybe you know the problem: There are no code style guidelines in your development team – or there are, but not every developer adheres to them. As a result, you need more time to understand another developer's code, and much of a review is "wasted" adapting the code to the style guidelines.
Modern IDEs can format code automatically, but every developer on the team has to configure this feature properly.
Another problem is that developers in teams have different levels of experience in areas such as software craftsmanship and clean code. Besides, everyone makes mistakes, regardless of their experience. Consequently, program code may be faulty, hard to read, and hard to maintain.
Most development teams have no security experts. As a result, program code often introduces security holes into the application, which can lead to severe data breaches. Even if security experts are present, they are only human and can overlook errors, or they face such a large codebase that checking it in detail requires extremely high effort.
In this series of articles, I explain:
The classic approach to solving these problems is regular code reviews. Ideally, these are part of the "Definition of Done," i.e., no task is done until a second developer has checked the associated code.
However, code reviews, crucial as they are, have the following drawbacks:
Code reviews can, fortunately, be automated quite well. The technique is called "static program analysis" or "static code analysis."
For static code analysis, numerous excellent open-source tools are available that address all the problems mentioned above. I will present them in detail in the third part of this series.
Once the appropriate tools have been set up, they can check the source code extremely quickly and give the developer numerous recommendations for improvement – regarding coding style, potential errors, bad practices, poor maintainability, and potential security gaps.
Static code analysis refers to the analysis of software without executing it. The analysis is done by automatically examining the entire source code according to a set of pre-defined rules and then notifying the programmer of the rule violations found.
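To make the idea of "examining source code according to pre-defined rules" more concrete, here is a minimal sketch of a toy analyzer. It is an illustration under simplifying assumptions, not how any particular tool is implemented: the class ToyStaticAnalyzer, its Rule type, and the three rules are hypothetical, and real analyzers typically work on a parsed representation of the code rather than on raw lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Toy illustration only: hypothetical names, line-based checks instead of a real parser. */
class ToyStaticAnalyzer {

    /** A rule pairs a human-readable message with a check applied to one source line. */
    static final class Rule {
        final String message;
        final Predicate<String> matches;

        Rule(String message, Predicate<String> matches) {
            this.message = message;
            this.matches = matches;
        }
    }

    /** Assumed rule set for the example; real tools ship hundreds of configurable rules. */
    static final List<Rule> RULES = Arrays.asList(
        new Rule("Line longer than 100 characters", line -> line.length() > 100),
        new Rule("Empty catch block swallows the exception",
                 line -> line.matches(".*catch\\s*\\(.*\\)\\s*\\{\\s*\\}.*")),
        new Rule("System.out used instead of a logger", line -> line.contains("System.out."))
    );

    /** Checks every line against every rule; the analyzed code is never executed. */
    static List<String> analyze(List<String> sourceLines) {
        List<String> violations = new ArrayList<>();
        for (int i = 0; i < sourceLines.size(); i++) {
            for (Rule rule : RULES) {
                if (rule.matches.test(sourceLines.get(i))) {
                    violations.add("line " + (i + 1) + ": " + rule.message);
                }
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
            "try { riskyCall(); } catch (Exception e) { }",
            "System.out.println(\"done\");"
        );
        // Prints one violation per offending line
        analyze(source).forEach(System.out::println);
    }
}
```

The important property is visible in analyze(): the input is treated purely as text to be inspected, so the check works even on code that cannot currently be compiled or run.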
Most static code analysis tools can be integrated into the development environment as plug-ins and highlight rule violations directly in the source code. IDE integration is a powerful feature, as the developer receives immediate feedback about possible vulnerabilities or bad practices during programming. If these are found later in the development cycle, the correction is much more time-consuming and thus more expensive.
Likewise, static code analysis tools can be integrated into automated build processes, generate reports and alerts, and – depending on the configuration – cause the build to fail.
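The "depending on the configuration" part can be pictured as a simple threshold check. The following sketch assumes a hypothetical BuildGate class with three made-up severity levels; actual tools and build plug-ins each have their own configuration options for this.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical build gate: fails the build when a finding reaches a configured severity. */
class BuildGate {

    /** Assumed severity levels, declared from least to most severe. */
    enum Severity { INFO, WARNING, ERROR }

    /** Returns true if at least one finding is as severe as the threshold or worse. */
    static boolean shouldFailBuild(List<Severity> findings, Severity failOn) {
        return findings.stream().anyMatch(s -> s.compareTo(failOn) >= 0);
    }

    public static void main(String[] args) {
        List<Severity> findings = Arrays.asList(Severity.INFO, Severity.WARNING);

        // With the threshold at ERROR, warnings are reported but do not break the build
        boolean fail = shouldFailBuild(findings, Severity.ERROR);
        System.out.println(fail ? "BUILD FAILED" : "BUILD OK");

        // A non-zero exit code is the usual way a command-line step signals failure
        if (fail) {
            System.exit(1);
        }
    }
}
```

Lowering the threshold to WARNING in this sketch would make the same findings fail the build, which is the kind of decision a team makes when configuring the tools.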
Static code analysis is thus a beneficial, automated code review process. Nevertheless, it cannot replace manual reviews. First, static code analysis can't detect 100% of all errors (it doesn't even know the functional requirements of the software), and second, it's essential to share knowledge about the code within the team.
Manual code reviews are tedious. The great strength of static code analysis lies in the fast and automatic checking of the entire codebase without the need to execute the code, which significantly reduces the effort required to detect problems in the code.
Manual code reviews involve developers. Automation takes the pressure off developers and allows them to focus more on the development of the software.
You can seamlessly integrate static code analysis into the Continuous Delivery process. It can, therefore, be performed entirely automatically and regularly. If a tool is extended to detect new problems, you can immediately detect those problems in the entire code base.
The earlier you find a defect, the lower the cost of its elimination. By integrating static code analysis tools into the IDE, problems can be detected and fixed very early, during programming. Developers are notified of this directly in the code – along with explanations and suggestions for improvements.
By automating the process, you can check the complete code – even code passages that the developers rarely see. Besides, static code analysis tools can analyze and verify all execution paths of an application, including those not covered by tests. Human errors are possible when configuring the tools, but not when executing them.
Ultimately, all of the benefits mentioned above lead to better code and product quality at a lower cost for the entire development project. The continuous delivery of secure, reliable, and maintainable software enhances the reputation of the developers and the company at which they work.
Static code analysis tools must be evaluated, studied, installed, and configured. However, these costs usually pay for themselves in just a few weeks. This series of articles should help to minimize preparation time and costs.
When you apply static code analysis tools to existing code, you may see thousands of problems, which usually results in developers simply ignoring these messages. Therefore, a rollout strategy is required. It is best to prioritize the problem types and initially display only the ones with the highest priority. Only once these are entirely fixed should the next most significant category be displayed and handled.
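Such a rollout strategy can be sketched as a filter over the open findings. The example below is a minimal sketch under assumptions: the RolloutFilter class, the Violation type, and the four categories with their ordering are hypothetical, and in practice you would achieve the same effect through the configuration of the tools.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

/** Hypothetical rollout filter: shows only the most important category that still has findings. */
class RolloutFilter {

    /** Assumed categories, declared from highest to lowest priority. */
    enum Category { SECURITY, BUG, BAD_PRACTICE, STYLE }

    static final class Violation {
        final Category category;
        final String description;

        Violation(Category category, String description) {
            this.category = category;
            this.description = description;
        }
    }

    /** Returns the open violations of the highest-priority category that is not yet empty. */
    static List<Violation> visible(List<Violation> open) {
        // Enum constants compare by declaration order, so min() yields the top priority
        Optional<Category> top = open.stream()
                .map(v -> v.category)
                .min(Comparator.naturalOrder());
        if (!top.isPresent()) {
            return Collections.emptyList();
        }
        Category current = top.get();
        return open.stream()
                .filter(v -> v.category == current)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Violation> open = Arrays.asList(
            new Violation(Category.STYLE, "Missing braces around if body"),
            new Violation(Category.BUG, "Possible null dereference"),
            new Violation(Category.STYLE, "Line too long"));
        // Only the BUG finding is shown; STYLE findings appear once all bugs are fixed
        for (Violation v : visible(open)) {
            System.out.println(v.category + ": " + v.description);
        }
    }
}
```

Because the filter always looks at the current list of open findings, the next category becomes visible automatically as soon as the previous one has been cleared.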
Style guide violations or specific error patterns can be reliably detected. Security problems – for example, in the authentication process, newly discovered security holes in external libraries, new attack patterns or incorrect configuration outside the source code – are challenging to find. Errors in the implementation of concurrent code that can lead to race conditions are also difficult to detect by static code analysis.
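To illustrate why concurrency errors are hard to spot, consider the following sketch of a classic race condition. The RaceConditionDemo class is a made-up example; whether a particular analyzer flags the plain counter depends on the tool and its configuration.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Made-up example: the same increment, once without and once with atomicity. */
class RaceConditionDemo {

    private int unsafeCounter = 0;
    private final AtomicInteger safeCounter = new AtomicInteger();

    void increment() {
        unsafeCounter++;               // read, add, write: another thread can interleave
        safeCounter.incrementAndGet(); // performed as one atomic step
    }

    int unsafeValue() { return unsafeCounter; }

    int safeValue() { return safeCounter.get(); }

    /** Runs two threads that each call increment() the given number of times. */
    static RaceConditionDemo run(int incrementsPerThread) {
        RaceConditionDemo demo = new RaceConditionDemo();
        Runnable work = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                demo.increment();
            }
        };
        Thread first = new Thread(work);
        Thread second = new Thread(work);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }
        return demo;
    }

    public static void main(String[] args) {
        RaceConditionDemo demo = run(100_000);
        // The atomic counter always reaches 200000; the plain counter may lose updates
        System.out.println("atomic counter: " + demo.safeValue());
        System.out.println("plain counter:  " + demo.unsafeValue());
    }
}
```

The line unsafeCounter++ is perfectly valid Java and is correct whenever only one thread uses the object. The defect exists only in combination with how run() shares the object between threads, and that kind of context is much harder for a rule-based check to establish than a style violation on a single line.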
Occasionally, static code analyzers mark correct code as incorrect. Such a false positive can happen when a tool is "unsure," e.g., when the integrity of input data is not verifiable or when the application interacts with closed-source components.
While this article series focuses on static code analysis, I would like to distinguish it from dynamic code analysis in at least one sentence: Unlike static code analysis, which checks program code without executing it, dynamic code analysis checks code during its execution. Examples of this are unit testing and profiling.
In this article, I have described software development challenges that are traditionally solved by code reviews. Since manual code reviews are complicated and expensive, they can be supported by tools for static code analysis.
In the next article, I will explain what types of static code analysis exist and how they address these challenges. I will, in particular, answer the following questions:
In the third and final part of the series, I will present the most relevant free Java tools for static code analysis in detail.