Fixing the staging/staging-next workflow

Creating this topic to move the discussion around the staging and staging-next workflow into its own thread, to avoid cluttering “Marketing Team: Can we present Nix/NixOS better?” with off-topic noise.

Problem statement

Our current staging and staging-next workflow is ill-equipped to detect fundamental problems with our package set.

Example: anything to do with building a stage1 boot environment was broken.

Discussion topic

What are some realistic ways to prevent such breakages from occurring on a “mainline” branch (e.g. staging, staging-next, master)?

Additional context

Testing changes on staging, or on a PR targeting staging, is usually very painful because a large number of packages must be rebuilt to vet the changes.
Although staging-next has a Hydra jobset dedicated to it, by the time a change reaches staging-next it is usually bundled with 50-500+ other changes, which makes it difficult to determine which change caused a regression.


In the case of, this prevented all nixosTests from providing any useful validation. Also, the time it took to land the fix caused the branch-off date of 20.09 to be pushed back by several days.


We need a smaller jobset for staging, and we should have a tested job for it (as we have for channels) that lists the most important jobs that need to pass.
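To make the idea concrete, here is a minimal sketch of such a gate in Python. It is not how Hydra implements channel gating; the job names and the `staging_is_tested` helper are hypothetical and only illustrate the rule that every required job must have passed.

```python
from typing import Iterable, Mapping

# Hypothetical list of "most important" jobs; the real list would need discussion.
REQUIRED_JOBS = ["stdenv", "nixos-tests-boot", "nixos-tests-installer"]


def staging_is_tested(results: Mapping[str, bool],
                      required: Iterable[str] = REQUIRED_JOBS) -> bool:
    """Return True only if every required job is present and succeeded.

    `results` maps a job name to whether it passed. A required job that is
    missing from `results` (for example, not built yet) counts as a failure.
    """
    return all(results.get(job, False) for job in required)
```

Jobs outside the required list do not affect the result, so a small jobset can gate the branch while the full rebuild continues elsewhere.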


There has been some discussion going on adjacent to #sig:sig-workflow-automation on how to implement a merge train with automatic bisection based on bors + adjacent tooling.
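For illustration, the bisection part could look roughly like the Python sketch below. It is not based on bors or any existing tooling: `first_bad_change` and the `builds_ok` callback are hypothetical names, and the sketch assumes that the base branch builds and that a single change introduces the regression, so every batch prefix containing it fails.

```python
from typing import Callable, Optional, Sequence


def first_bad_change(changes: Sequence[str],
                     builds_ok: Callable[[Sequence[str]], bool]) -> Optional[str]:
    """Return the first change whose inclusion breaks the build, or None.

    `builds_ok(prefix)` is assumed to build the base branch plus the given
    prefix of the batch and report success. The base alone is assumed to pass.
    """
    if builds_ok(changes):
        return None  # the whole batch is fine, nothing to bisect
    if not changes:
        raise ValueError("the base branch itself is broken")

    # Invariant: the prefix of length `lo` passes, the prefix of length `hi` fails.
    lo, hi = 0, len(changes)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if builds_ok(changes[:mid]):
            lo = mid
        else:
            hi = mid
    return changes[hi - 1]
```

Under those assumptions, a batch of 500 changes needs at most ten evaluations (one for the full batch plus nine halving steps), which is the main argument for bisecting automatically rather than by hand.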

There is a draft RFC that lays the groundwork for marking broken packages as broken = true and for proactively coordinating a remediation period with downstream maintainers.