How do I review the PRs?
As a lead software engineer at the startup I work in, most of my time at work is spent on reviewing peer PRs. More than the person who is reviewing the changes, it is the "review" process which has significance in maintaining the code quality. People often misunderstand this fact. Development and review process are two different areas which can only be observed by two different lenses. There is less significance for the skill set of the individuals that are involved. It takes lot of efforts and time for one to switch the roles between developer and reviewer. That is the reason, even the team lead sends the PRs to peers, because it is difficult to take a step back and look at it holistically when you are doing low level implementation.
The code review process itself consists of multiple steps. You can consider them as multiple scans of the entire change, where as each scan has a purpose.
1. The title and description of the PR
Personally, I prefer the PRs to be reviewed in the absence of the author. In the world where remote work is proven to work, all the communications and operations (at least in software development) has to happen asynchronously. The title and the description of the PR plays main role in it. It should be quite evident of the changes that the PR carries just by reading the title and descriptions. The set the agenda of the changes.
2. Lines of change
Subconsciously my eyes roll to this section as soon as I open the pull request. I have few thumb rules about the lines of change of the code of a PR. Any change which carries +- 100 lines of code is a big change. Anything less than 40 is a concise one. Generally, I look at the lines of change and decide to review it instantly or to do it later. It requires to set up the mind to review any big changes. I strongly suggest the developers to break down the changes into smaller, independent changes so that the review process can be done faster and will less risk. More lines of code, more risk.
3. Files changed
With quick top to bottom scroll, I can see all the files or the modules that are being changed. Even without looking at the actual change of the files, most of the times I identify the issues just by looking at the files that are involved. For example, let us say you are adding a new way of fetching the user records from the database. For this particular change, you don't have to involve any of the other other layers of the application other than user layer. If so, it clearly means that something is wrong in the change.
4. Module or layer dependency
If there are multiple modules are layers involved in the change, it is very important to look at the dependency structure. Taking the previous example, let us say you are trying to add a feature to set the author (user) for a post. Assuming the post module already depending on the user module, if you make the user module depend on the post module because of any reason, it is a clear conflict. It means that the developer has not understand the relation between the modules. Even though this type of changes work without any issues, they are going to create significant issues in very near future.
5. Naming and types
Naming is the core part of the programming. I believe in the philosophy that if you cannot name something clearly, you did not understand it fully. It is very important to name functions, classes, variables properly without any ambiguity. Few points on top of my head
- Lists should always be plural
- Functions names should be verbs or they should represent actions
- Class names should be nouns
- Grammar matters. It is not "is_contain", it is "does_contain" and so on.
- Enums should always be singular
Types are as important as naming. If there is a function "get_name" it should return string or an entity called Name. It should not return List of strings. Use generics wherever possible. As the popular saying, we should write code for humans but not machines. Let the code be self explanatory. Comments should be last stop.
6. Domain knowledge
There will be very few changes which does not involve the business logic. Often, the reviewer does not have all the information about the changes that are being done. It is important for the reviewer to seek review from product team who would be having the domain knowledge. In fact, the code should be in a way that even non technical people can understand the high level changes just by reading the function/class/variable names. For instant, due_date = now().replace(7) + 3 * 24 * 60 * 60 (vs) next_sunday = now().replace(7); DAYS_3 = 3 * 24 * 60 * 60; due_date = next_sunday + DAYS_3. The second approach is easy for people to understand.
7. Tests
Almost never approve a code change which does not carry the tests which can tell the logic is intact against future changes. Tests provide the confidence on the system. Your team can confidently make any change if there are multiple tests failing for any change they do. Any fix should absolutely carry a fix that would have failed and passes with the new changes. Assertions should never be removed without a strong and valid reason. It is also important to keep updating the test cases otherwise we end up in a mess of legacy test cases that no one understand at later point of time.
8. Be empathetic
I emphasis this topic as it impacts the culture of the team. The reviewer should be composed and be gentle not only in the verbal communications, but also in terms of the tone of the comments on the PR. There is no point of making same comments again and again just because the developer made the same mistake at multiple places. Instead, it is better to ping the developer separately and make them understand. The reviewer should also be open to send the code changes to peers irrespective of the levels and should be accepting the feedback. Smaller but healthier team produces quality output than larger but messed up team.