Commit messages - Are we happy with them in nixpkgs?

andir · October 25, 2020, 1:47am

I noticed more and more commit messages that are not entirely fitting with our CONTRIBUTING.md section on writing good commit messages. In there we state that they should not just contain a brief headline (that is just a copy of the what has been done) but also a bit of information on the reasoning, cross references and such. The obvious exception are trivial package bumps. Very often I see this information only in the PR descriptions.

Personally I prefer having at least as much detail in the commit messages as there is in the PR description. The SCM history is useful for figuring out why a change was made and potentially what lead to that decision. It is also a good place to state trade-offs that have been made etc…
Having access to those information from git log or often also in combination with git blame is very powerful. If I had to jump to the browser for every commit that was made to figure out if that is related to what I am looking for that would be a disaster. Worst case the PR would also not contain much detail but was simply waved through.

Whenever I’m reviewing contributions I have been trying to encourage writing proper commit messages. This also extends to the structure of a PR. IMHO we shouldn’t have 20 fixups in a branch for 20 typos.

During a review that I did today a contributor decided to close their PR instead of adding meaningful context (that was only vaguely obvious from the PR description). This got me thinking about the whole topic…

My impression has always been that the guidelines described in CONTRIBUTING.md are the baseline we are aiming for. You can always do better. You can occasionally do worse (I’m guilty of that) but in general you should aim for that and/or above.

Is this no longer inline with what the community see as good practice? Should we enforce this better? What is your opinion? What can we do to improve?

bhipple · October 25, 2020, 2:44am

Having good commit messages is great. For me, I write the descriptive commit msg in git, then use hub to create the PR from the cmdline. This is the best of both worlds for info, since it pre-fills the PR description with the git commit body.

jonringer · October 25, 2020, 5:08am

good commit messages help a lot when doing git bisect or reviewing history in general. I think it’s a pretty low bar for the benefits it provides.

FRidh · October 25, 2020, 8:29am

@andir you are right here. Commit messages are very useful, and should at least follow the contributing guidelines.

Sometimes people forget, sometimes it did not strike them as important but when pointed out they make the change, and sometime they refuse to improve the commit messages. If contributors are not willing to follow those base guidelines, then I don’t think we should mind not having their contributions.

I guess here what matters is the attitude towards git history. Some say history should never be changed. I am of the opinion the history in a tiny little PR is of irrelevance to the bigger Nixpkgs project, and worse, is just noise.

NobbZ · October 25, 2020, 9:23am

Personally I just drop the short headline in the commit message for version bumps or adding packages.

Usually there is no reasing to the commit other than bumping a version or adding the package, other than bumping/adding.

Though for stuff that is more involved, I try to give the same reasoning in commit message and PR text, perhaps rephrased a bit, as I still try to keep things brief in the commit message.

danieldk · October 25, 2020, 10:19am

I fully agree with you @andir , I have often had to hunt between git history and various PRs/issues to figure out the rationale of a change.

Unfortunately, this also extends to PRs themselves. Often new package PRs have an empty Motivation section. Then before reviewing you first have to figure out what the PR actually adds. It would be so much easier if people added a short blurb + a link to the website.

danieldk · October 25, 2020, 10:40am

Some reviewers told me to put in a link to the changelog entry on version bumps (when available), so I started doing that. Though CONTRIBUTING.md states that no additional information is necessary in the case of version bumps:

For package version upgrades and such a one-line commit message is usually sufficient.

7c6f434c · October 25, 2020, 10:45am

It would be so much easier if people added a short blurb + a link to the website.

Isn’t it called meta? Why put into history mess what should be there in the actual code?

danieldk · October 25, 2020, 10:49am

I was talking about the Motivation section of the PR. There are certain ‘new package’-PRs that I am not in to review (e.g. blockchain-related). Now one needs to go to the PR and then to “Files changed” and then scroll to meta to know if the package is interesting.

andir · October 26, 2020, 12:48am

Related to this topic I was just reading the GDB Contribution guide and saw this section about their expectations of how a commit message should look like:

https://sourceware.org/gdb/wiki/ContributionChecklist#Detailed_Explanation_of_the_Patch

While I think this is written pretty explicit and on point I see that we couldn’t just adopt it 1:1. Maybe rephrasing our section to make it a bit clearer and less “open” could help?

worldofpeace · October 26, 2020, 1:26am

I will second that we need to document that for updates contributors actually have to explain/understand and audit somewhat what changed in an update. Same with reviewers.

andir · October 26, 2020, 3:27am

Yeah. What would a good requirement be? Just asking them for the changelog in the commit and/or a link to the changes?

How feasible is it to verify those commit messages via Regex? We can’t automatically judge if the commit message actually contains the proper content.

Maybe we could have something like this @timokau’s marvin bot guide contributors when it encounters a package version bump without description?

worldofpeace · October 26, 2020, 3:52am

I think there’s a big thread somewhere where me and @jtojnar on auditing changelogs as the workflow of upgrading packages. I can’t find it atm, but it basically goes over some really good requirements IIRC that would be useful to this thread. We’ve for sure seen that without that you cannot even review an update properly, and once you look at a changelog you see all the mistakes.

7c6f434c · October 26, 2020, 6:06am

Yeah. What would a good requirement be? Just asking them for the changelog in the commit and/or a link to the changes?

Then of course there are different upstreams w.r.t. changelogâ¦

I think for many upstreams nobody is realistically qualified to fully read their changelog quickly; maybe encoding Â«these tests mean working good enoughÂ» would be useful sometimes.

Then there is kernel where brief changelogs could be called lying (you never now how security critical what is).

I guess hard changelog policy would prevent the systemd issue in the latest release by making it infeasible to find out quickly enough all that systemd changes in releases?

How feasible is it to verify those commit messages via Regex? We can’t automatically judge if the commit message actually contains the proper content.

Some upstreams with high impact and good changelogs maybe should even be labeled as such and specifically for them we could require quoting changelog in commit messages or in PR discussion with automatic enforcement? Up to fetching upstream changelog in parallel from their VCS web browser and search for keywords from it, dunno.

Maybe we could have something like this @timokau’s marvin bot guide contributors when it encounters a package version bump without description?

For packages with meta.changelog it could even be cool to auto-pastethe head of it with link to more into the discussion?