Code Review: But what do I look for?
CR Part II: Functional correctness, design and best practices, readability
In the previous installment, we considered the code review process using the Six W's:
Why: To find flaws in the code, share knowledge among engineers, and support long-term maintainability.
Who: Everyone who writes code
What: All code should be reviewed
When: Often and for short sessions
How: Using a lightweight approach with a quick turnaround time
We also introduced some communication tips to keep things flowing and maintain harmony.
In this installment, we'll shift gears and focus on what to look for when performing a review. I intended this to be the second half, but it went longer than I expected and needed to be split. This installment looks at functional correctness, design and best practices, and readability. We’ll have a third installment that will cover other categories before finally moving on.
Now let's dig in. As we do so, remember that different projects have different requirements, and you don’t have time to go through every PR with a fine-toothed comb. Pick the most relevant, or the most interesting, questions. Whatever works for you.
Functional Correctness
We should ensure that the code does what it is supposed to do. Walk through the logic and see if it makes sense to you. Consider asking yourself a few specific questions.
Does the code handle edge cases?
Think about possible edge cases that might arise. What happens if this code encounters them? Does the logic still hold?
Is the code robust against errors?
Think about possible sources of error. What happens if errors are thrown? Does the code react well?
Think about unknown unknowns. If a dependency acts unexpectedly, or if an exception is thrown in a random line of code, will the system recover appropriately?
I once worked with a caching layer, sitting right above a database. It included a feature flag to dynamically disable caching for particular objects. Something like:
if (FeatureFlags.DisableCacheFor.includes(name)) {
// Ignore caching logic and make a database call
}
Now DisableCacheFor
presented itself as an array of strings, and everyone treated it as if it was an array of strings, but the property actually ran some logic every time it was accessed. One day when we changed the underlying DisableCacheFor
value, this logic began throwing exceptions. The exceptions bubbled up to the caching layer. From that point on every request into the caching layer, every request for database information, threw an exception. This included requests from the admin application that we needed to revert the DisableCacheFor
value!
Everything went down. For 45 minutes. Eventually, a very hesitant database administrator made a direct edit to our production SQL database table to get DisableCacheFor
reverted.
The postmortem discussion for that one was fun.
Design and Best Practices
We should ensure that the code is designed well, follows software engineering best practices, and matches the coding standards or conventions of the team. Consider these questions.
Is this the best place for the change?
Perhaps this code belongs in a different class, a different file, or a different section of the same file. Perhaps a new file belongs elsewhere in the folder structure.
Is it consistent with the current structure and style of the code?
Inconsistent code structure will make the code harder to understand over time.
Could the change have reused existing components?
This is a really common one. If the author wrote a function or logic that seems broadly useful, it may have already been implemented elsewhere. Util files are notorious catch-alls for this kind of thing.
Does the change introduce duplication?
Similar to the previous question, but in this case the author reimplemented (or copy/pasted) logic that should be separated out into a shared function.
There was once a company whose app included a section that displayed a bunch of graphs. There were maybe five classes involved in rendering that page. At some point in their history, they needed a second, similar section that displayed a slightly different bunch of graphs. What did the dev do?
They copy/pasted all five classes and made minor edits. Later they needed a third page with slightly different graphs. Copy/paste again.
By the time I got there, there were six different copy/paste versions of those classes. Each stack was slightly different from the others. The area was rife with bugs because there was no way to update the shared functionality in one place. You had to update each of the six stacks individually. Often one stack had a unique quirk that escaped notice. Sometimes a stack was missed entirely.
It was a mess.
Are the functions small, and is the logic as simple as possible?
Keeping the functions small and simple makes the code easier to understand, maintain, and refactor later. It also makes for easier deduplication and code reuse.
Does it add complexity for an unlikely future?
I’m a fan of YAGNI (you aren’t gonna need it), a principle that you shouldn’t implement things you think you might need in the future. Predicting the future is hard and requirements often change over time.
Consider a PR that introduces a new concrete class that extends a new abstract class. Now there are two likely possibilities here:
The author knows that they will soon add other concrete classes that extend the abstract class.
The author thinks that they might eventually need to add other concrete classes that extend the abstract class.
#1 is totally fine. The author is planning for a known event that will happen in the short term.
#2 is very questionable. The author is introducing additional complexity that might come in handy later (but probably won't). Better to just have the one concrete class, abstracting out behavior later if you find you do need a second concrete class.
Readability
This can be difficult for the author to recognize since they have spent significant time working with this code. Even convoluted code will make sense if you are the one who wrote it! As the reviewer, you should consider anything confusing as a potential maintainability problem.
Can I understand the code by reading it?
This is very subjective but still worth it. Remember that there is a good chance that in the future the author will have left the company and others will be charged with maintaining this code. Maybe you!
Do the names reflect the thing they represent?
Aside from proper and consistent naming conventions (eg. firstName
, first_name
, szFirstName
), you should be able to know what a variable contains or a function does without having to look up the declaration.
Consider these potential variable names that might refer to the same string[]
ar
bList
bagelList
availableBagels
availableBagelTypes
availableBagelTypeArray
The right choice depends on context. Super shorthand might be fine in a lambda.
bakedGoodsByCategory.forEach(ar => updateAvailability(ar));
A small function might have some implicit context.
function setAvailable(bagelList) { ... }
In order places a long, expressive name is best.
Are exception and logging messages understandable?
Often just as important as readable code is the ability to debug in production.
Are confusing sections sufficiently commented?
For sections of code that can’t be simplified, comments better explain what it does.
Lots of Practice
This installment covered some specifics of what to look for when code reviewing, or at least for the specific categories of functional correctness, design and best practices, and readability. In the next installment we’ll conclude this topic by examining some other categories.
If you have any fun code review stories, or other questions you like to ask when reviewing, let us know in a comment or email reply!