Code Reviews–What are we looking for?
A few years back I wrote about the issue of when to perform code reviews. Code reviews which happen late in the development cycle can be a waste of time, since it may be too late to act on the findings. However, there is another, much bigger, problem with code reviews, and that is the issue of what are we supposed to be looking for in the first place?
At first this might seem to be a non-issue. After all, the reviewer is simply looking for problems of any sort. But the trouble is that there are so many different types of problem a piece of code might contain that it would take a very disciplined reviewer to consciously check for each one. It would also take a very long time. The reality is that most code reviewers are only checking for a small subset of the potential issues.
So what sorts of things ought we to be looking for?
I put this one first as it is the one thing that almost all code reviewers seem to be good at. If you break the whitespace rules, variable naming conventions, or miss out a function or class comment, you can be sure the code reviewer will point this out.
However, these are things that a good tool such as ReSharper ought to be finding for you. Human code reviewers ought to be focusing their attention on more significant problems.
Other things an experienced developer will be quick to pick up on are when you are using the language in a sub-optimal way. In C#, this includes things like correct use of using statements and the dispose pattern, using LINQ where appropriate, knowing when to use IEnumerable and when to use an array, and making types immutable where appropriate.
Observations like this are particularly useful to help train junior developers to improve the way. But again, many of these issues could be discovered by a good code analysis tool.
In a large system, keeping code maintainable is as critical as keeping it bug free. So a code review should look for problems such as high code complexity. I almost always point out a few places in which the code I review could be made “cleaner” by refactoring into shorter methods with descriptive names.
Uncle Bob Martin’s “SOLID” acronym is also a helpful guide. Are there bloated classes with too many responsibilities? Are we constantly making changes to the same class because it isn’t “open to extensibility”? Are we violating the “Liskov Substitution Principle” by checking if an object is an instance of a specific derived type?
Noticing maintainability problems at code review time, such as tight or hidden coupling through the use of singletons, may require some substantial re-work of the feature, but it will be much less painful to do it while it is still fresh in the developer’s mind rather than having to sort the mess out later on in the project lifetime.
Every software project will have some kind of organization and structure, and in an ideal world, all developers are aware of this and make their changes in the “right” place and the right way. However, with large projects, it can be possible for developers to be unaware of the infrastructure that is in place. For example, do they ignore the error reporting mechanism in favour of re-inventing their own? Do they know about the IoC container and how to use it? Did they keep the business logic out of the view code?
For issues like this to be picked up, the code reviewer needs to be someone familiar with the original design of the code being modified. It may be necessary on some occasions for the code reviewer to ask the developer not to copy an existing convention because it has been discovered to be problematic.
Most discussions of code reviews assume that finding bugs is the main purpose, but from the list of items I’ve already mentioned it is not hard to see how we could actually forget to look for them.
This involves visually inspecting the code looking for things like potential null reference exceptions, or off by one errors. A lot of it is about checking whether the right thing happens when errors are encountered.
The trouble is that if the code is highly complex or badly structured, then it may be close to impossible to find bugs with a visual inspection. This is why the “maintainability” part of the code review is so important. Giray Özil sums this problem up brilliantly:
“Ask a programmer to review 10 lines of code, he'll find 10 issues. Ask him to do 500 lines and he'll say it looks good.”
If the code reviewer doesn’t have a firm grasp of the actual requirements that the coder was trying to implement, it is quite possible that code that simply doesn’t meet the requirements gets through the code review. The more understanding a code reviewer has of the actual business problem this code is trying to solve, the more likely they are to spot flaws (or gaping holes) in its implementation.
Now we move onto some areas for review that relate to specific types of code. The first is user interface code. This adds a whole host of things for a reviewer to check such as correct layout of controls, correct use of colour, all localisable text read from resource files, etc. There will likely be some established conventions in place for user interface, and for a professional looking project, you need to ensure they are adhered to.
There is the also need to review user interfaces for usability. Developers are notoriously bad at designing good UIs, so a separate review with a UX expert might be appropriate.
Multi-threaded code is notoriously hard to get right. Are there potential race conditions, dead-locks, or resources that should be protected with a lock? It is important that a code reviewer is aware of which parts of the code might be used in a multi-threaded scenario, and give special attention to those parts. Again, the code needs to be as simple as possible. A 4000 line class that could be called on multiple different threads from dozens of different places will be close to impossible to verify for thread safety.
A good code reviewer should be asking what has been done to test the code under review, including both manual and automated tests. If there are unit tests, they too ought to be code reviewed for correctness and maintainability. Sometimes the best thing a code reviewer can do is suggest additional ways in which the code ought to be tested (or made more testable – the preference should always be for automated tests where possible).
One of the most commonly overlooked concerns in code reviews is security. There are the obvious things like checking the code isn’t vulnerable to a SQL injection attack, or that the passwords aren’t stored in plaintext, or that the admin-level APIs can only actually be run by someone with the correct privileges. But there are countless subtle tricks that hackers have up their sleeves, and really your code will need regular security reviews by experts.
This is another commonly overlooked concern. Of course, much of our code doesn’t need to be aggressively performance tuned, but that doesn’t mean we can get away without thinking about performance. This requires knowledge of what sort of load the system will be under in “real-world” customer environments.
Another big issue particularly with large enterprise systems that the code may need to run on all kinds of different operating systems, or against different databases. It might need to be able to load serialized items from previous versions of the software, or cope with upgrading from the previous version of the database. If these problems are not spotted in code review, they can cause significant disruption in the field as customers find that the new version of the software doesn’t work in their environment.
Where possible, automated tests should be guarding against compatibility breakages, but it is always worth considering these issues in a code review.
If you are in the situation where you need to actively develop and maintain multiple versions of your product, then a code reviewer needs to consider the impact of these changes on future merges. As nice as it might be to completely reformat all the code, rename all the variables and reorganize all the code into new locations, you may also be making it completely impossible for any future merges to succeed.
Looking at the list above you might start to think that a code reviewer has no hope of properly covering all the bases. For this reason I think it is important that we clarify what a particular code review is intended to find. There may need to be special security, performance or UI focused reviews. I also think we should be using automated tools wherever possible to find as many of these issues for us.
I’d love to hear your thoughts on this topic. Which of these areas ought a code review to focus on? Have I missed any areas?