Ibizaman's Blog - Bisect Experiment at the PR Level

mightyiam · May 13, 2026, 9:54am

samestep · May 13, 2026, 12:09pm

Self-plug: this post mentions hydrasect, but I’m curious whether the author might find a use for my tool npc, which I built for pretty much the exact use case they’re describing: samestep/npc (Nixpkgs channel history CLI) on GitHub.

npc has various npc bisect subcommands, although it doesn’t currently have a fully automated npc bisect run subcommand like git bisect run; it shouldn’t be too hard to add that though, so maybe I should just do it.

ibizaman · May 13, 2026, 4:56pm

Your project is exactly what I hacked together with a bash script. You should definitely add it to the wiki so it can be found. Although to be totally honest I didn’t search that much.

Note though that I do not need a git bisect run command. Instead, I would be using the start/good/bad commands. Maybe I didn’t explain well enough how this works. I have an idea: a timeline should make it much easier to follow.

Say we start at midnight and the workflow controlling the bisection runs every 3 hours:

00:00 Workflow starts. There is no PR yet. It clones the repo, runs nix flake update nixpkgs, then creates a PR with the result.
00:05 PR is created with new flake.lock and checks start.
00:15 Checks are done and there is a failure.
03:00 Workflow starts. It sees that its PR is open and has failures. It then bisects between the top of my main branch and the commit in the PR, and pushes the resulting commit to the PR.
03:05 Checks are done and there is a failure.
06:00 cycle continues…

So you see, git bisect run is not used directly; the cron job drives the bisection instead. And the checks are run by the actions in the PR, not in the workflow doing the bisect. This was done intentionally to reuse the existing machinery.
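A minimal sketch of the decision made on each scheduled run, written in Python rather than the bash the actual workflow uses. It is not the real implementation: the names tick and PullRequest are hypothetical, nixpkgs commits are modeled as integer positions in a linear history, and leaving a green PR alone until it gets merged is an assumption the timeline does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PullRequest:
    commit: int  # position of the nixpkgs commit locked in the PR (hypothetical model)
    checks: str  # "pending", "success", or "failure"


def tick(main_commit: int, latest_commit: int, pr: Optional[PullRequest]) -> Optional[int]:
    """Return the commit position to push to the PR on this run, or None to do nothing."""
    if pr is None:
        # 00:00 in the timeline: no PR yet, so propose the newest nixpkgs commit.
        return latest_commit if latest_commit > main_commit else None
    if pr.checks != "failure":
        # Checks are still running, or they passed and the PR waits for a merge (assumption).
        return None
    # 03:00 in the timeline: bisect between main and the failing commit in the PR.
    midpoint = (main_commit + pr.commit) // 2
    # Nothing is left to try once the failing commit sits right after main.
    return midpoint if midpoint > main_commit else None
```

With main at position 0 and the newest commit at position 8, successive failing runs would propose 8, then 4, then 2, then 1, and then stop. The function only picks the next commit; the PR's own checks do the testing.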

Using your tool I would need to cache the nixpkgs repo as well as the state. I’ll try it.

samestep · May 14, 2026, 12:51am

Glad to hear it’s relevant! I’ve added a link on that wiki page.

Another possibly-relevant comparison point: for the flake that contains my own personal NixOS and Home Manager configs, I have a nightly GitHub Actions workflow that first updates flake.lock, then builds all the different configs to make sure there are no failures, and only then creates a PR. Then every morning I wake up, see an email from GitHub about the PR that got created automatically, and tap Merge on my phone.

So sometimes things are broken for a few days in a row; in that situation, there are simply no PRs that get created until the breakage is fixed. If I understand correctly, the effect is fairly similar to your approach: the locked Nixpkgs commit is always close to the most recent working one, since it was being updated every day up until failures started happening.
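A rough sketch of that effect, assuming every PR that gets created is merged before the next nightly run. The function name nightly_updates and the integer commit positions are hypothetical, and builds_ok stands in for building all the configs.

```python
from typing import Callable, List


def nightly_updates(latest_by_night: List[int], builds_ok: Callable[[int], bool], locked: int) -> List[int]:
    """Return the locked commit position after each night."""
    history = []
    for candidate in latest_by_night:
        # A PR is only created (and, by assumption, merged) when the build succeeds.
        if builds_ok(candidate):
            locked = candidate
        history.append(locked)
    return history
```

For example, if the newest commit on five consecutive nights is 1, 2, 3, 4, 5 and commits 3 and 4 fail to build, the locked commit goes 1, 2, 2, 2, 5: it stays on the last working commit during the breakage and catches up once the build is fixed.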

Let me know if I misunderstood a key aspect/benefit of your approach, though!

ibizaman · May 14, 2026, 9:12am

Yes, I also have that same behavior with the PR failing multiple times in a row. But usually what blocks me “forever” are changes to upstream modules which I need to adapt to. These of course always need manual intervention to be fixed. For example, this test I have for Nextcloud checks the interaction between SSO, LDAP, and multiple Nextcloud apps, so it is bound to fail at some point.

I see one difference though: in my case the tests are run in the PR that got created. I’m doing that because I no longer push to the main branch directly and always open a PR, which runs a sizeable amount of tests. So I thought of reusing that instead of having them run upfront.

samestep · May 14, 2026, 12:14pm

Yeah I think a downside of my approach is that I have to basically duplicate all the GitHub Actions YAML from the Build workflow into the Update workflow; whereas in your approach, you only need one workflow to run the tests and it just runs on pull requests.

ibizaman · May 14, 2026, 4:33pm

You could use YAML imports to get around the duplication, but getting it right can be finicky.