Code Quality - Amazon CodeGuru's Machine Learning to the rescue?

August 4, 2020

Jakub Slawinski

Business Problem

What are the real costs of poor IT software quality?

In short, the cost of poor-quality software in the US in 2018 was approximately $2.84 trillion, according to the Consortium for IT Software Quality (CISQ).

How can this number be calculated? It is not a trivial matter. As usual, it is easier to point out the visible consequences of poor software quality:

Customer problem reports
Customer service calls
Lawsuits/warranty claims
QA & test department costs
Service outages

However, there are also multiple hidden or hardly visible ones:

Finding & fixing internal problems/defects
Canceled and troubled projects
Unaccounted overtime (crisis mode)
Waste and rework
Successful cyber attacks
Staffing problems (e.g. turnover)
Poor teamwork
Lack of good planning
Dubious project value/ROI
Excessive systems costs
Lost market opportunities
Lack of good practices & quality standards
Understanding complex code
Technical debt
Poor quality data

Both lists are taken from the CISQ report mentioned above.

The real cost of software bugs is a combination of money, time, and reputational damage. If defects are found early, the costs are minimized. If defects are found in production and affect thousands of customers, the costs might be huge. According to an IBM report from 2008, a bug found after release costs 30 times more than one found during the design phase. Moreover, it is not only a matter of money. Sometimes even mission-critical software has vulnerabilities, and such bugs might cost human lives.

There are many techniques that can help improve the quality of the software being produced. This time, let’s focus on the two most relevant to software development: Static Code Analysis and Code Reviews. How can these techniques be used effectively?

Solution

Static Code Analysis is performed by automated tools like PMD, Checkstyle, FindBugs, or SonarQube. The analysis can be run ad hoc by developers (from the command line or IDE add-ons like QAPlug), but it is usually built into CI/CD pipelines. There are also many SaaS platforms that provide such analysis, such as SonarCloud, Codacy, Snyk, Travis, and many more.

Analysis reports for the whole project are rarely effective, as the number of issues found (usually minor ones) is often overwhelming. It is much more convenient to analyze individual pull requests and look only at new issues and trends (a feature called Quality Gate: fail the build if it does not meet your quality requirements).
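
The idea behind a quality gate can be sketched in a few lines of Java. This is not how SonarQube or any other tool implements it; the class name, the issue keys, and the zero-tolerance threshold are all made up for illustration. The sketch compares the issues reported on the base branch with those reported on the pull request, and the gate passes only when the number of new issues stays within the allowed limit.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a quality gate evaluated on a pull request.
// Issue keys and the threshold are illustrative, not taken from any real tool.
final class QualityGate {

    /** Returns the issues present on the pull request that were not on the base branch. */
    static Set<String> newIssues(Set<String> baseline, Set<String> pullRequest) {
        Set<String> fresh = new HashSet<>(pullRequest);
        fresh.removeAll(baseline);
        return fresh;
    }

    /** The gate passes when the number of new issues does not exceed the allowed maximum. */
    static boolean passes(Set<String> baseline, Set<String> pullRequest, int maxNewIssues) {
        return newIssues(baseline, pullRequest).size() <= maxNewIssues;
    }

    public static void main(String[] args) {
        Set<String> baseline = Set.of("Foo.java:unused-import", "Bar.java:empty-catch");
        Set<String> pullRequest = Set.of(
                "Foo.java:unused-import", "Bar.java:empty-catch", "Baz.java:unclosed-stream");

        // Old issues are ignored; only what the pull request introduces counts.
        System.out.println("New issues: " + newIssues(baseline, pullRequest));
        System.out.println("Gate passed: " + passes(baseline, pullRequest, 0));
    }
}
```

In a CI pipeline, a failed gate would typically be turned into a non-zero exit code so that the build is marked as failed and the pull request cannot be merged.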

Code review is a software quality assurance activity in which one or several people (reviewers) check a program, mainly by reading its source code. Although the direct discovery of quality problems is often the main goal, code reviews are usually performed to reach a combination of goals:

Better code quality
Finding defects
Learning/Knowledge transfer
Increasing the sense of mutual responsibility
Finding better solutions
Complying with QA guidelines

However, manual code reviews might become boring and tiresome after some time, which might let more and more code smells slip through. Moreover, project timeline and budget pressure sometimes do not allow reviewers to spend as much time as needed to maintain the required level of quality.

The perfect solution for this situation is to combine code reviews with Static Code Analysis. This not only saves the reviewers’ time but also increases the quality of the product by preventing bad patterns and bugs from being accepted. Moreover, it gives developers more opportunities to grow, as the automated tools usually provide a lot of context, rationale, and suggestions for the issues they find.

Despite the many players in the code quality industry, Amazon announced a new service at the re:Invent conference in Las Vegas in December 2019. This new automated code review and profiling tool from AWS is called CodeGuru.

Amazon CodeGuru Profiler helps developers find an application’s most expensive lines of code along with specific visualizations and recommendations on how to improve code to save money.

Amazon CodeGuru Reviewer uses machine learning to identify critical issues and hard-to-find bugs during application development to improve code quality.

CodeGuru Reviewer is a direct competitor of the previously mentioned platforms. According to Amazon itself, its recommendations are based on decades of knowledge and experience, machine learning, best practices, and hard-learned lessons across millions of code reviews. The following kinds of recommendations are provided:

AWS best practices
Concurrency
Resource leak prevention
Sensitive information leak prevention
Common coding best practices
Refactoring
Input validation
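
To make one of these categories more tangible, here is a small Java sketch of a typical concurrency defect: a read-then-write sequence on a ConcurrentHashMap that is not atomic, next to an atomic alternative. The HitCounter class is hypothetical, and whether CodeGuru Reviewer reports this exact pattern is an assumption; the example only illustrates the kind of problem the Concurrency category is about.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.IntStream;

// Hypothetical example of a concurrency defect and its fix.
final class HitCounter {

    private final ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();

    // Racy: get() and put() are separate operations, so two threads can read
    // the same old value and one of the increments is lost.
    void recordRacy(String page) {
        Integer current = hits.get(page);
        hits.put(page, current == null ? 1 : current + 1);
    }

    // Atomic: merge() performs the whole read-modify-write as one operation.
    void recordAtomic(String page) {
        hits.merge(page, 1, Integer::sum);
    }

    int count(String page) {
        return hits.getOrDefault(page, 0);
    }

    public static void main(String[] args) {
        HitCounter counter = new HitCounter();
        // Increment from several threads at once via a parallel stream.
        IntStream.range(0, 10_000).parallel().forEach(i -> counter.recordAtomic("home"));
        System.out.println("atomic count: " + counter.count("home"));
    }
}
```

Both methods use a thread-safe map, which is exactly why the racy variant is easy to miss in a manual review: each call is safe on its own, but the combination is not.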

Is machine learning from Amazon CodeGuru Reviewer better than other existing tools?

As with most AWS services, CodeGuru Reviewer is easy to set up.

The first step is to associate the repository.

CodeGuru Reviewer supports associations with the following repositories:

AWS CodeCommit
Bitbucket
GitHub
GitHub Enterprise Server

Making the association is straightforward. You have to choose the source provider, select/configure the connection, and select a single repository from the available ones.

When the associated repositories are configured, we can go to the code reviews section.

It contains two tabs:

Pull request
Repository analysis

Pull request code reviews are performed automatically by Amazon CodeGuru whenever a new pull request is created. The comments and recommendations are also added directly to the pull request in the repository.

Repository analysis reports cover a whole branch of the repository and have to be requested manually.

We tested Amazon CodeGuru on multiple projects at different stages of maturity. One of the first things to notice is the surprisingly small number of recommendations for each project. This is especially intriguing compared to the results provided by competitors (SonarQube, Codacy, command-line/IDE static analysis tools). Moreover, the majority of these recommendations were related to code duplication, and only a few of them pointed to potential resource leaks. This is definitely disappointing after hearing promises that AI and machine learning would improve the code review process in a new and better way. Worse, so few findings can create a false impression that the software is of high quality.
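
For readers unfamiliar with the term, the sketch below shows what a potential resource leak looks like in Java and how try-with-resources avoids it. It is neither code from the tested projects nor actual CodeGuru output; TrackedResource is a hypothetical stand-in for a stream, socket, or database connection, used here so that the effect can be observed directly.

```java
// Illustrative only: TrackedResource is a made-up stand-in for a real resource.
final class ResourceLeakDemo {

    static final class TrackedResource implements AutoCloseable {
        private boolean closed = false;

        String read(boolean fail) {
            if (fail) {
                throw new IllegalStateException("read failed");
            }
            return "data";
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Leak-prone: close() is skipped whenever read() throws.
    static void readLeaky(TrackedResource resource, boolean fail) {
        resource.read(fail);
        resource.close();
    }

    // Safe: try-with-resources closes the resource on every exit path.
    static void readSafely(TrackedResource resource, boolean fail) {
        try (TrackedResource r = resource) {
            r.read(fail);
        }
    }

    public static void main(String[] args) {
        TrackedResource leaked = new TrackedResource();
        try {
            readLeaky(leaked, true);
        } catch (IllegalStateException expected) {
            // The failure is intentional; what matters is the state of the resource.
        }
        System.out.println("leaky variant closed the resource: " + leaked.isClosed());

        TrackedResource safe = new TrackedResource();
        try {
            readSafely(safe, true);
        } catch (IllegalStateException expected) {
            // Same failure, but the resource is closed before the exception propagates.
        }
        System.out.println("safe variant closed the resource: " + safe.isClosed());
    }
}
```

The leaky variant works fine on the happy path, which is why such defects tend to survive both testing and manual review and only show up under failure conditions.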

Last but not least, Amazon CodeGuru currently supports only Java, which might be a huge limitation for most companies.

Result

Amazon CodeGuru is still gaining traction. It was announced in December 2019 and has been actively developed since then. Although it is not as mature as other tools, the promise of using machine learning and AI to perform automatic code reviews and propose customized recommendations sounds really interesting. The current verdict, however, is that it is too early to adopt Amazon CodeGuru Reviewer, but it is worth tracking the service’s development and trying it again in the future.
