I opened LinkedIn for a moment today and saw an old colleague had liked a post that was basically: "No one likes reviewing massive PRs." More discouraging, 90% of the comments agreed, some as useless as "LGTM" ("Looks good to me"), the stock reply for approving a PR without digging in. The implication being that the dev simply looks at the PR, sees the scale, and rubber-stamps it.
Honestly, I'm a bit shocked that someone would post this publicly, and that others are perfectly happy to agree and go along with it.
I've personally written a decent number of absolutely massive PRs. The problem is not the size of the PR, but how you approach reviewing it.
PRs are (typically) only massive for one of three reasons:
- The PR is bad and should be rewritten
- There is a low level change which cascades through the codebase
- There is a large, new feature
What is, I think, most important to note here is that all three of these ALSO represent some of the most important times for PROPER code reviews.
In my experience, that first scenario is the least common and also the quickest to pick up on (and then reject). If a PR isn't a new feature, the work should most likely be either concise or repetitive. If it is concise, no problem. And if the bulk isn't repetition either, the commit is most likely attempting to solve an overly broad issue or is simply bad code, in which case you can likely reject it without reading the whole thing.
I have scanned through a PR with several hundred file changes in under 10 minutes. It was a few low-level changes that cascaded through the solution. Each set of modified lines was either identical between files or had a class name swapped out. You'd be amazed at how quick and efficient the human eye and brain are at locating inconsistencies; an extra whitespace here or there would catch my eye.
I would review each unique set of changes once and then quickly scan for any deviation. If something changed, I would review that to see if it was a new change or the dev being inconsistent.
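That deviation scan can even be partly mechanized. A minimal sketch: count how often each exact changed line repeats, so the mechanical, cascading edits collapse into a few high-count lines while a one-off deviation surfaces with a count of 1. The diff lines here are hypothetical and inlined; in a real review you would feed in something like `git diff main...feature -U0` filtered to the `+`/`-` lines.

```shell
# Sketch: count repeated diff lines so one-off deviations stand out.
# The +/- lines below are hypothetical; a real run would pipe in
#   git diff main...feature -U0 | grep '^[+-]' | grep -v '^[+-][+-]'
printf '%s\n' \
  '-    OldService svc = new OldService();' \
  '+    NewService svc = new NewService();' \
  '-    OldService svc = new OldService();' \
  '+    NewService svc = new NewService();' \
  '+    NewService svc = new NewService(cfg);' \
| sort | uniq -c | sort -n
# The cfg variant sorts to the top with count 1: the inconsistency to inspect.
```

This doesn't replace reading the unique change set once; it just tells you which files deserve the second look.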
If a PR was still massive and there was no way around it, I'd reach out to the author and ask for a quick rundown (in their own words) of what they did, along with which files/changes they thought were most important. This gives the review focus. I would then thoroughly review those, plus a few changes selected at random, and skim the rest.
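For the random sample, anything reproducible works. A minimal sketch, assuming GNU coreutils `shuf` and a hypothetical inlined file list standing in for the real changed-file list:

```shell
# Sketch: pick 3 changed files at random for an in-depth pass.
# The file list is hypothetical; a real review would use something like
#   git diff --name-only main...feature | shuf -n 3
printf '%s\n' src/api.py src/models.py src/views.py tests/test_api.py docs/notes.md \
| shuf -n 3
```

Sampling at random (rather than letting the author pick everything) keeps the "most important files" summary honest.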
And that is the art of the code review: figuring out how to review code efficiently. Yes, I can afford more time per line on a 20-line change than on a 2,000-line one. But bailing on a PR because it is long simply defeats the point. As I said, these PRs are the most important because:
- Bad PR means bad practices and/or bad code. Obviously, you don't want to merge in bad code.
- Low level changes have the broadest ability to break a system.
- New code is the most likely place to introduce bad habits and the most likely place to contain new bugs.
If THESE are the places where you're simply tossing in a "LGTM", you're a negligent liability.
Large PRs need different considerations than smaller ones. It is likely, in many cases, that these larger PRs were necessary and necessarily large. Risk does NOT grow linearly with the amount of code in a commit; it follows more of an S-shaped curve, growing very slowly at first, then rapidly past a certain size, before leveling off again.
For a smaller PR, say 1-50 lines, it is perfectly reasonable to carefully scrutinize every line. You're unlikely to find bugs, but you may well find things to optimize or inconsistencies with the rest of the codebase.
Beyond that, you need a way to focus on what matters. I mentioned above what I do for foundational changes that cascade out, but for a feature I'll spend a LOT of time scrutinizing test cases and one or two key methods. I might select a few more files at random to put a decent amount of effort into. The rest I'm comfortable skimming over (not SKIPPING over).
I'm coming from a place where most PRs come from a single developer at a single point in time. If this isn't you, then the approach needed may vary.
In my case, the general quality of the code from one file to another won't differ much regardless of the experience of the developer in question. Spending minutes per line would be a waste.
Carefully reviewing a few files will generally tell you if there are issues. If I hit a lot of red flags early on, I'll just highlight those, kick it right back, and refer them to the standards/style guide/etc... If it gets past that, I'll try to think of ways to break it and have them fix the holes and cover them with unit tests.
This feels like a "who watches the watchers" problem. We bring in PR reviews to improve code quality, but I've never seen a company retroactively validate the work of the people doing the reviewing. When a bug reaches production through a reviewed PR, the person who approved it is MORE to blame than the person who wrote the code, especially if it is an obvious bug.