Remote nixos-rebuild works with `build`, but not with `switch`

I’ve been trying to stop cloning my flake onto my Linux servers and instead deploy it directly from my home Mac with nixos-rebuild. This works when I only build it:

$ nixos-rebuild build --flake path:.#gateway --target-host gateway --build-host gateway
building the system configuration...
warning: The interpretation of store paths arguments ending in `.drv` recently changed. If this command is now failing try again with '/nix/store/cdyygmpgsv4dnv745dmc6i7bn1ycqya0-nixos-system-gateway-23.11.20230812.f045184.drv^*'
warning: you did not specify '--add-root'; the result might be removed by the garbage collector

Slightly annoying that I have to type the hostname three times, but whatever.
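For what it's worth, the repeated hostname can be folded into a small shell function. This is only a sketch: the name `deploy` and the `DEPLOY_DRY_RUN` variable are made up for illustration, and it assumes the flake attribute has the same name as the SSH host, as in the command above.

```shell
# Hypothetical helper: deploy HOST [ACTION] [extra nixos-rebuild flags...]
# ACTION defaults to "switch". Set DEPLOY_DRY_RUN=1 to print the command
# instead of running it.
deploy() {
  if [ "$#" -lt 1 ]; then
    echo "usage: deploy HOST [ACTION] [flags...]" >&2
    return 1
  fi
  host="$1"
  shift
  action="switch"
  if [ "$#" -gt 0 ]; then
    action="$1"
    shift
  fi
  # The same host name is used for the flake attribute, the target host
  # and the build host.
  ${DEPLOY_DRY_RUN:+echo} nixos-rebuild "$action" \
    --flake "path:.#$host" \
    --target-host "$host" \
    --build-host "$host" \
    "$@"
}
```

With this, `deploy gateway build` should run the same command as above, and any additional flags can be appended after the action.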

But when I try to use switch, I get this puzzling error:

$ nixos-rebuild switch --flake path:.#gateway --target-host gateway --build-host gateway
/nix/store/m0pvnwa4pm7xcagazz0a7h8wg2vqi4lw-nixos-rebuild/bin/nixos-rebuild: line 378: /nix/store/8fdd0nqajq5sk1m6p4qnn0z0j9d7n3q5-coreutils-9.3/bin/mktemp: cannot execute binary file: Exec format error

I know that Exec format error means that it’s trying to run a Linux binary on my Mac (or was it x86 on aarch64?), but why does nixos-rebuild download an incompatible coreutils to my Mac when I run this command? And how can I work around this?
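One way to tell the two cases apart is to look at the first bytes of the binary that failed to run. The sketch below is not part of nixos-rebuild; `exe_format` is a made-up helper that only recognises the common ELF and Mach-O magic numbers.

```shell
# Hypothetical helper: report the executable format of FILE based on its
# first four bytes (the "magic number").
exe_format() {
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \t\n')
  case "$magic" in
    7f454c46)
      echo "ELF" ;;      # Linux and most other Unix-like systems
    cffaedfe|cefaedfe|cafebabe)
      echo "Mach-O" ;;   # macOS: 64-bit, 32-bit, or universal binary
    *)
      echo "unknown" ;;
  esac
}
```

Running `exe_format` on the `mktemp` path from the error message and comparing the result with `uname -sm` on the local machine shows whether the file was built for a different operating system. If it reports ELF on a Mac, it is a Linux binary rather than a macOS binary for the wrong CPU.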

Aha, so I can apparently use --fast to work around this, which gets me further:

$ nixos-rebuild switch --flake path:.#gateway --target-host gateway --build-host gateway --fast
building the system configuration...
warning: The interpretation of store paths arguments ending in `.drv` recently changed. If this command is now failing try again with '/nix/store/9qb61nzi2sbbgi0v0ym4x7k9wlggdkbf-nixos-system-gateway-23.11.20231026.808c0d8.drv^*'
warning: you did not specify '--add-root'; the result might be removed by the garbage collector
error: creating symlink from '/nix/var/nix/profiles/system-4-link.tmp-21884-1279440096' to '/nix/store/md62l2f4ndnqfg99dwpllrn0nd459bki-nixos-system-gateway-23.11.20231026.808c0d8': Permission denied

I know about --use-remote-sudo, so let’s try that:

$ nixos-rebuild switch --flake path:.#gateway --target-host gateway --build-host gateway --fast --use-remote-sudo
building the system configuration...
warning: The interpretation of store paths arguments ending in `.drv` recently changed. If this command is now failing try again with '/nix/store/9qb61nzi2sbbgi0v0ym4x7k9wlggdkbf-nixos-system-gateway-23.11.20231026.808c0d8.drv^*'
sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper
sudo: a password is required

Uh. What? Is that feature just broken? Or am I holding it wrong?

Use NIX_SSHOPTS=-t to enter a sudo password when using nixos-rebuild. Or, configure sudo to not require a password (using something like pam_rssh or carefully configured passwordless sudo).

NIX_SSHOPTS=-t is still a bit flaky. This thread is filled with people for whom it doesn’t work for some reason: Remote nixos-rebuild: sudo askpass problem

I never did dig into why it’s broken. deploy-rs doesn’t fix it either; it’s some quirk of ssh+sudo.

Sorry for replying so late. As @TLATER suspected, this doesn’t work for me:

$ NIX_SSHOPTS=-t nixos-rebuild switch --flake path:.#gateway --target-host gateway --build-host gateway --fast --use-remote-sudo
building the system configuration...
warning: The interpretation of store paths arguments ending in `.drv` recently changed. If this command is now failing try again with '/nix/store/asr3a3qgckpi764354p43jxdh0bj2glw-nixos-system-gateway-23.11.20231026.808c0d8.drv^*'
Pseudo-terminal will not be allocated because stdin is not a terminal.
Shared connection to 111.111.111.111 closed.
zsh:1: bad pattern: ^[[35
zsh:1: bad pattern: 1mwarning:^[[0m
Shared connection to 111.111.111.111 closed.

After the “Pseudo-terminal will not be allocated” message, it stopped, so I waited for a few seconds and then pasted my password. I specifically ran this from a local bash shell, but it seems the issue originates from the remote shell being zsh, which is a potential issue mentioned in the thread you linked.

I’ll try to run it with bash on the target host at some point as well, maybe that solves it. Either way it’s a usability issue. I didn’t see this reported on GitHub, so I might write up a bug report myself once I have time again.

I’ve been having this issue with a remote bash and local zsh.

@iFreilicht I use

  security.sudo.wheelNeedsPassword = false;

to work around this problem.

Yeah I really don’t wanna do that, though, the security implications of that are kinda iffy :grimacing:

Slightly safer workarounds live in the thread I linked.

Do wonder if something could be done to fix the actual bug.

Ok so with both those workarounds together, it works well:

$ nixos-rebuild switch --flake path:.dotfiles#gateway --target-host gateway --build-host gateway --fast --use-remote-sudo
building the system configuration...
warning: The interpretation of store paths arguments ending in `.drv` recently changed. If this command is now failing try again with '/nix/store/5v9lsbbchzvnmcnhdbvpa8m3mpcmblyw-nixos-system-gateway-24.05.20240417.2e359fb.drv^*'
warning: you did not specify '--add-root'; the result might be removed by the garbage collector
Shared connection to 111.111.111.111 closed.
Shared connection to 111.111.111.111 closed.
updating GRUB 2 menu...
activating the configuration...
setting up /etc...
reloading user units for felix...
restarting sysinit-reactivation.target
Shared connection to 111.111.111.111 closed.

It also works without setting security.sudo.wheelNeedsPassword = false;, but then I have to paste the root password 3-4 times in a row, which is somewhat annoying.

I guess I’ll just live with this slightly insecure setup for now.

@iFreilicht @TLATER Is this issue with the shared connection dropping twice before actually deploying to the remote host a known issue and tracked somewhere? Very weird issue, and super annoying if you do not want to go with the insecure setup of using the root user for deployments.

I’d recommend security.pam.sshAgentAuth instead, personally. That doesn’t require disabling passwords for wheel.

It’s tracked in these two separate tickets so far:

But yeah, looks like there has been no progress in fixing it for at least 4 years.

@TLATER I have set security.pam.sshAgentAuth.enable = true on the remote server now, but when I use the command nixos-rebuild switch --flake .#server --target-host shania@<public-ip> --use-remote-sudo it still asks for the password of the shania user.

Can you ssh to that server without password when not using nixos-rebuild? Your ssh client needs to be configured to allow ssh agent forwarding with ForwardAgent. If that doesn’t work, check the ssh manual, use -vvv to see detailed errors on your client, and check the server-side logs.
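A quick way to check the forwarding itself, independent of nixos-rebuild, is to list the agent's keys locally and then again through the SSH connection. This is a sketch under assumptions: `check_agent_forwarding` and `DRY_RUN` are made-up names, and it assumes `ssh-add` is available on the server.

```shell
# Hypothetical helper: check that the local SSH agent holds a key and that
# the agent is forwarded to HOST. Set DRY_RUN=1 to print the commands
# instead of running them.
check_agent_forwarding() {
  host="$1"
  if [ -n "${DRY_RUN:-}" ]; then
    run=echo
  else
    run=
  fi
  # Locally: list the keys the agent currently holds.
  $run ssh-add -l || return 1
  # Remotely: with forwarding enabled, ssh-add on the server talks to the
  # same agent, so it should list the same keys.
  $run ssh -o ForwardAgent=yes "$host" ssh-add -l
}
```

If the second command lists no keys or cannot reach an agent, forwarding is not working yet. Since nixos-rebuild reads extra ssh options from NIX_SSHOPTS, setting `NIX_SSHOPTS="-o ForwardAgent=yes"` may be an alternative to changing the ssh client configuration.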

Yes, I can ssh there without a password when not using nixos-rebuild.

And you authorized your client key? If you have, check the server-side logs. Without access to your server, there’s not much I can do besides asking whether you’ve configured PAM correctly, sadly.

Ah, I have nothing set there. Do I need to put my public SSH key there as well?

Yeah. If you’ve configured your key with the appropriate user setting, the defaults should work.

@TLATER I have set those two settings on my remote machine:

users.users.shania.openssh.authorizedKeys.keys = [ <my-public-key> ];
security.pam.sshAgentAuth.enable = true;

I have also set this:

services.openssh.settings.PasswordAuthentication = false;
users.mutableUsers = false;