Nixos-rebuild-ng: a nixos-rebuild rewrite

k0kada · December 4, 2024, 10:10am

Just merged nixos-rebuild-ng: implement the remaining missing features by thiagokokada · Pull Request #360215 · NixOS/nixpkgs · GitHub. Once it hits the unstable branches, we will have the first version of nixos-rebuild-ng with feature parity with nixos-rebuild. This means that most options that you expect in nixos-rebuild should be implemented and working:

--target-host
--build-host
- Yes, even “silly” things like using both at the same time
nixos-rebuild-ng repl

And of course things that already worked before, like nixos-rebuild-ng switch/boot/test/build/dry-build/dry-activate/list-generations.

--target-host and --build-host are implemented slightly differently from nixos-rebuild, since they don’t allocate a pseudo-TTY (which in my tests causes more issues and headaches than it is worth). Instead, to ask for the sudo password there is an --ask-sudo-password flag that asks for the remote sudo password at the start of the program and injects the password into every remote sudo command via stdin.

This means that there is no more double sudo password prompt like in nixos-rebuild, and the whole experience is more streamlined (e.g.: you don’t need to wait until the build finishes to type your password). However, there is also no error checking (we don’t validate the password, so if you type it wrong it will fail later).
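
To illustrate the stdin approach described above, here is a minimal sketch in Python. It is not the actual nixos-rebuild-ng code: the function names are hypothetical, and it assumes the remote sudo accepts the standard --stdin and --prompt options.

```python
import subprocess


def remote_sudo_command(host: str, command: list[str]) -> list[str]:
    """Build an ssh invocation that runs `command` under sudo on `host`.

    Hypothetical helper: --stdin makes sudo read the password from
    standard input instead of a terminal, and --prompt= silences the
    prompt text, so no pseudo-TTY is needed.
    """
    return ["ssh", host, "--", "sudo", "--prompt=", "--stdin", *command]


def run_remote_sudo(host: str, command: list[str], password: str) -> None:
    # The password travels over ssh's stdin. Nothing validates it here,
    # so a wrong password only surfaces when the remote sudo fails.
    subprocess.run(
        remote_sudo_command(host, command),
        input=password + "\n",
        text=True,
        check=True,
    )
```

A caller would read the password once at startup (for example with getpass.getpass()) and pass the same value to every run_remote_sudo() call, which is why there is only one prompt.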

Another big change in this release is the implementation of proper logging. We now have logging.debug() information that should help debug issues. It is disabled by default but can be enabled using the --verbose flag (which also enables verbose output inside the nix commands).
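
As a rough sketch of how a single flag can drive both the program's own debug logging and the verbosity of the wrapped nix commands (the names below are made up for illustration and are not taken from nixos-rebuild-ng):

```python
import argparse
import logging

# Hypothetical logger name, used only in this sketch.
logger = logging.getLogger("rebuild-sketch")


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="rebuild-sketch")
    parser.add_argument("--verbose", "-v", action="store_true")
    return parser.parse_args(argv)


def configure_logging(verbose: bool) -> None:
    logging.basicConfig(format="%(levelname)s: %(message)s")
    # Debug messages are dropped unless --verbose was passed.
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)


def nix_flags(verbose: bool) -> list[str]:
    # Forward the verbosity to the wrapped nix command as well.
    return ["--verbose"] if verbose else []
```

With this layout, logger.debug("...") calls can stay in the code permanently and cost almost nothing when the flag is off.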

The only feature missing is _NIXOS_REBUILD_REEXEC, which “updates” nixos-rebuild-ng in place when there is a newer one, so we can have its bug fixes before switching to a new configuration. The feature is implemented but disabled in this release because we first need to implement the module changes so that config.system.build.nixos-rebuild can point to nixos-rebuild-ng. That will probably be my next focus.

k0kada · December 4, 2024, 11:23am

BTW, I forgot to thank both @Scrumplex for helping review my PRs and @R-VdP for testing and reporting issues.

aur3l14no · December 6, 2024, 1:26pm

@k0kada Cool! Have been using it, and I feel that when it’s ready, maybe the nix wrapper library can be extracted for other uses.
One quick question: is there a way to build on-site remotely? If not I’d like to give a proposal: skip nix-copy-closure when build_host == target_host. Useful for rebuilding a beefy remote machine.

k0kada · December 6, 2024, 1:44pm

If I understood your question, this should already work, except for the fact that yes it will call nix-copy-closure to copy between build_host and target_host (that will be the same in this case). It shouldn’t be a huge issue though since the call should finish really fast (since it will conclude that no copy should be made).

Now for implementing something new: I want to avoid any feature requests for now. Otherwise this will forever be chasing new use cases instead of focusing on replacing the current code. The most important thing right now is to get testers to exercise the different use cases of the original and to fix the bugs that we find. Once the code is stable we can go back to implementing new features (that is also the objective: one of the reasons for the rewrite is to make it easier to implement new features).

I am not sure if I want to try to make this a library. Most of the logic is pretty heavily inspired by the use cases of nixos-rebuild, and I feel that if you want to do something else with it, it is better to wrap nix yourself. But I am not sure; maybe I will change my mind in the future.

phaer · December 6, 2024, 2:52pm

I want to avoid any feature requests for now.

Understandably! What about features that might be added to the existing implementation soon? Would it be a good time to start a PR on build-image for the python implementation or shall we wait until it stabilizes?

Asking for nixos-rebuild: init build-image subcommand by phaer · Pull Request #347275 · NixOS/nixpkgs · GitHub

k0kada · December 6, 2024, 2:59pm

This is one of the reasons I added myself as nixos-rebuild maintainer recently: nixos-rebuild: add thiagokokada as maintainer by thiagokokada · Pull Request #362244 · NixOS/nixpkgs · GitHub. So yes, if you get something merged in the nixos-rebuild and have the know-how to port it to nixos-rebuild-ng, please do!

It would definitely help me focus on the more structural PRs that need to be done (like the changes to the module system), and if you can also give me feedback on what can be improved (e.g.: is updating the code/documentation easy?) I would be grateful.

k0kada · December 9, 2024, 3:01pm

Let me expand this comment because someone asked something similar in a recent PR.

Right now, while using --build-host alice --target-host alice works, the implementation is definitely wonky: once the build finishes, we call ssh alice -- nix-copy-closure --to alice, which means that alice needs to be able to SSH into itself.

Even if you’re using different hosts, let’s say, --build-host alice --target-host bob, this is still wonky because it means that alice needs to somehow be able to talk with bob, e.g.: if alice and bob are aliases in the current ~/.ssh/config, it means that alice also needs to have bob configured in the SSH config and credentials, contrary to what the user probably expects (i.e.: only having the config in their eval machine).

(Keep in mind that this is also the case of nixos-rebuild, since we use basically the same strategy).

Sadly, this is difficult to fix with nix-copy-closure, since it supports either --to or --from, not both. So right now I am thinking of two possible solutions:

Call nix-copy-closure twice, once as nix-copy-closure --from alice and once as nix-copy-closure --to bob. This would mean that the copy will probably take twice as long and we will need extra space on the eval machine (enough to hold the whole configuration)
Replace nix-copy-closure with nix copy, which supports using --to and --from together. It is still kind of sub-optimal because instead of having alice talk directly with bob, the connection between the two now seems to go through the eval machine (I tested this with a Chromebook and it was really slow because the Wi-Fi connection in my Chromebook is poor), but I am not sure there is much we can do (and --use-substitutes still exists for those cases). This would also mean that any usage of --to and --from together would need a newer version of Nix, and we try to be compatible with Nix 2.3 since some folks still depend on it

Right now I am thinking of maybe checking which Nix version the person is using and using the first strategy if the user has an older version of Nix, or the second one if the user has a newer version. But if anyone can think of a better solution, I am accepting suggestions.

k0kada · December 12, 2024, 12:46pm

Just merged nixos-rebuild-ng: add module changes and port tests from nixos-rebuild by thiagokokada · Pull Request #363922 · NixOS/nixpkgs · GitHub. Once it hits the unstable branches, adventurous users that want to help testing can opt-in to replace their nixos-rebuild with nixos-rebuild-ng by using system.rebuild.enableNg = true;.

Note that I said adventurous users: for now I still recommend adding it via environment.systemPackages = [ pkgs.nixos-rebuild-ng ]; instead of system.rebuild.enableNg = true;, since this way you can use nixos-rebuild-ng side-by-side with nixos-rebuild. I have been using system.rebuild.enableNg = true; for a few days with success, but your mileage may vary (especially if you’re using classic Nix instead of Flakes, since I don’t have non-Flakes systems to test).

This is also the end of the big PRs in nixos-rebuild-ng. I expect from now most PRs will be for bug fixes. There are also a few improvements that I am planning (like the usage of nix copy in place of nix-copy-closure), but I don’t expect big changes.

Another thing missing is testing how nixos-rebuild-ng behaves in the installer. We have some installer tests that I haven’t ported to nixos-rebuild-ng yet, since I had some issues that I didn’t bother debugging too much (the installer tests are painful since they’re really slow to run). But this is on my to-do list.

It is also a good time for people to take a look and suggest improvements. @phaer if you want to port build-image this is definitely a good time to do now.

k0kada · December 16, 2024, 10:13am

Merged nixos-rebuild-ng: change copy closure logic when copying from_host -> to_host by thiagokokada · Pull Request #364698 · NixOS/nixpkgs · GitHub. Once it hits the unstable branches, this will change the behavior when both --build-host and --target-host are passed to use nix copy --from build_host --to target_host instead of ssh build_host -- nix-copy-closure --to target_host. This should cover your case @aur3l14no, since when build_host == target_host, nix copy will just do nothing (this is better than trying to be smart about whether build_host == target_host, since to do this correctly we would need to somehow parse the SSH configuration).

For users on Nix <2.18 (in the current nixpkgs this effectively only covers users of Nix 2.3), the code will fall back to calling nix-copy-closure twice, once with --from build_host and once with --to target_host. This means the copy will be slower and the evaluation host will also need free space to store a copy of the build result. But I don’t think this should impact many users.
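
The version-based fallback described above could look roughly like the following sketch. The helper names are hypothetical, the ssh:// store URLs are an assumption about how the hosts would be addressed, and a real invocation may need extra flags (for example to enable the nix command); only the 2.18 threshold and the two strategies come from this post.

```python
import re


def parse_nix_version(version_output: str) -> tuple[int, int]:
    """Extract (major, minor) from text such as 'nix (Nix) 2.18.1'."""
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"cannot parse Nix version from {version_output!r}")
    return int(match.group(1)), int(match.group(2))


def copy_commands(
    version: tuple[int, int], build_host: str, target_host: str, path: str
) -> list[list[str]]:
    """Return the command(s) needed to move `path` from build to target."""
    if version >= (2, 18):
        # One call; the eval machine relays between the two hosts.
        return [[
            "nix", "copy",
            "--from", f"ssh://{build_host}",
            "--to", f"ssh://{target_host}",
            path,
        ]]
    # Older Nix: pull to the eval machine first, then push to the target.
    return [
        ["nix-copy-closure", "--from", build_host, path],
        ["nix-copy-closure", "--to", target_host, path],
    ]
```

Comparing versions as integer tuples matters here: as plain strings, "2.3" would sort after "2.18" and the old Nix would wrongly take the nix copy path.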

Some other small improvements:

Parallelization of nixos-rebuild-ng list-generations: this command can get surprisingly slow, especially with lots of generations. With ~10 generations, this change reduced the runtime from ~170ms to ~100ms. For comparison, nixos-rebuild list-generations will take up to ~1 second(!) in the same situation, so not bad in general
Removal of tabulate as a dependency. It was bothering me that we depended on a library that is almost as complex as this whole program. With the help of ChatGPT, we now have 0 runtime dependencies again
Add support to --build-host for build-vm and build-vm-with-bootloader, something that I completely missed by mistake
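
For the list-generations speedup mentioned above, one way to parallelize per-generation work using only the standard library is a thread pool. This is a sketch under that assumption; the post does not say which mechanism the real change uses, and describe_all and its describe callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def describe_all(
    paths: list[str],
    describe: Callable[[str], dict],
    max_workers: int = 8,
) -> list[dict]:
    """Run `describe` on every generation path concurrently.

    `describe` stands in for whatever slow, mostly I/O-bound work is
    needed per generation (reading files, spawning a subprocess, ...).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so the listing stays sorted.
        return list(pool.map(describe, paths))
```

Threads fit this kind of work because the time is typically spent waiting on the filesystem or on child processes rather than in Python code.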

For another announcement that I forgot to do, nixos-rebuild-ng now also has a proper manual (e.g.: not just copying the old manual from nixos-rebuild) and can automatically generate Bash/ZSH completion from the command line arguments.

The manual is generated from scdoc, which is kind of like Markdown but for man pages. I hope this drives up collaboration, since I find troff difficult to edit/understand, while the scdoc format is so small that I put an explanation of its syntax in a few lines of comment in the man page document itself.

justinas · December 23, 2024, 5:49pm

Perhaps it would be good to have an option to enable verbose logging for nixos-rebuild-ng itself, but not the forked Nix commands? nix-build -v can be quite noisy: evaluating a simple NixOS system using the verbose mode prints out 2746 lines of logs, only a few of which come from nixos-rebuild-ng.

Maybe this could be a -v versus -vv kind of deal.

k0kada · December 23, 2024, 6:17pm

We already have that: use the --debug flag.

waffle8946 · January 10, 2025, 1:34pm

Out of curiosity - since the entire python ecosystem breaks on staging merges, is there / will there be any failsafe in place to ensure that the rebuild command doesn’t break every 2 weeks?

clamydo · January 11, 2025, 9:47am

Should issues with nixos-rebuild-ng be posted in the nixpkgs issue tracker or here?

The issue I’m having (with current unstable build): nixos-rebuild-ng does not accept multiple -I attributes. Only the last one is used.

Example:
nixos-rebuild-ng -I nixos-config=./configuration.nix -I nixpkgs=$HOME/.nix-defexpr/channels/pinned_nixpkgs build-vm

Only the last -I is respected.

k0kada · January 11, 2025, 11:01am

What breaks in the Python ecosystem? nixos-rebuild-ng has no runtime dependencies, so if the issue is Python packages breaking, this is a non-issue.

Edit: there is one build dependency on pytest, but if this becomes an issue I have a plan to convert all tests to unittest.

k0kada · January 11, 2025, 11:03am

Can you open an issue in nixpkgs and tag me? This should be an easy fix, but I want to track these.

Edit: opened a PR to fix: nixos-rebuild-ng: fix -I flag passed multiple times by thiagokokada · Pull Request #372919 · NixOS/nixpkgs · GitHub.
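
For readers wondering how "only the last -I is used" can happen: this thread does not say what the actual cause was, but the symptom matches a common argparse pitfall, shown in this sketch. The parser below is a toy, not the real nixos-rebuild-ng one.

```python
import argparse


def parse(argv: list[str], action: str) -> argparse.Namespace:
    """Toy parser with a single -I option and one positional sub-action."""
    parser = argparse.ArgumentParser(prog="rebuild-sketch")
    parser.add_argument("-I", dest="include", action=action)
    parser.add_argument("action_name")
    return parser.parse_args(argv)


argv = [
    "-I", "nixos-config=./configuration.nix",
    "-I", "nixpkgs=/tmp/pinned_nixpkgs",
    "build-vm",
]

# action="store" (the argparse default) silently overwrites earlier values.
last_only = parse(argv, "store").include

# action="append" collects every occurrence into a list.
all_values = parse(argv, "append").include
```

Here last_only ends up holding just the nixpkgs= entry, while all_values keeps both entries in order, ready to be forwarded to nix as repeated -I flags.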

waffle8946 · January 11, 2025, 9:34pm

I just saw the python3Packages.shtab dependency in nativeBuildInputs, I’m not familiar with that package to know if it’s well-tested post staging merges or what its transitive dependencies might be.

EDIT: Now that I think about it, if nixos-rebuild-ng didn’t build due to some dependency issue, then I wouldn’t be able to deploy a broken version of the rebuilder. So I guess either way it’s not as much of an issue as I originally thought.

k0kada · January 12, 2025, 2:28pm

Forgot about shtab, but it is an optional build dependency for the shell completion. It can be disabled if you set withShellFiles = false in the package input.

waffle8946 · January 12, 2025, 2:29pm

Yes, I know how overrides work; I was simply asking about failsafes within the python ecosystem. I’ll also note the default is on, so that package should be kept in good shape for the average user, especially if nixos-rebuild-ng becomes the default.

k0kada · January 12, 2025, 2:32pm

My main point is that if this becomes a problem, there are solutions, because the core of nixos-rebuild-ng is designed to be dependency free. For example, if shtab becomes an issue, we can just go back to manually writing shell completions. If pytest becomes an issue, we can convert the tests to unittest, which is included in Python’s standard library.

Also, those things are improvements compared to the original implementation, not requirements. For example, the original implementation generates shell completion manually and it is missing a few recent commands. The original implementation also has no unit tests.

waffle8946 · January 12, 2025, 7:20pm

I agree, I hope I didn’t come across as unappreciative of your work. Thanks for the improvements.