NixOS-rebuild on GCE VM

Hi all,

I’m trying to go Terraform + flakes for a “NixOS VM on Google Cloud” deployment.

I use this command to build a NixOS raw image from a flake:

nix build --no-link --json "..#nixosConfigurations.wireguard-gateway.config.system.build.googleComputeImage"

The image is then used in Terraform when spinning new VMs. This works perfectly.

The problem is that any change to the system’s Nix config triggers a rebuild of the image, and therefore the re-creation of the VM (because the NixOS image has changed), which is very cumbersome. What I want instead is to be able to run:

nixos-rebuild switch --target-host user@host --build-host localhost --flake ..#wireguard-gateway

Running the above fails with error: creating symlink from '/nix/var/nix/profiles/.0_system' to 'system-1-link': Permission denied

Running the above while appending --use-remote-sudo fails with sudo: you do not exist in the passwd database.

I tried running both as my normal user and as root.

So it looks like a user permission problem? The machines are configured with OS Login, which might add some complexity into the mix. But I can definitely SSH into the machines, both via direct ssh and via gcloud compute ssh.

Here is the flake.nix:

{
  description = "Foo infrastructure";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-20.09";
  };

  outputs = inputs:
    let
      system = "x86_64-linux";
      pkgs = inputs.nixpkgs.legacyPackages.${system};
    in
    {
      nixosConfigurations.wireguard-gateway = inputs.nixpkgs.lib.nixosSystem {
        inherit system;

        modules = [
          ./nix/configuration.nix
        ];
      };

      devShell.${system} = pkgs.mkShell {
        nativeBuildInputs = with pkgs; [ jq terraform google-cloud-sdk ];

        PROJECT_ID = "project-foo-bar";
      };
    };
}

And this is the configuration.nix:

{ config, lib, pkgs, modulesPath, ... }:

{
  imports = [
    (modulesPath + "/virtualisation/google-compute-image.nix")
  ];

  services.openssh = {
    enable = true;
    passwordAuthentication = false;
    allowSFTP = false;
  };

  networking = {
    nat = {
      enable = true;
      externalInterface = "eth1";
      internalInterfaces = [ "wg0" ];
    };

    wireguard.interfaces.wg0 = {
      ips = [ "10.1.1.2/24" ];
      listenPort = 51820;
      generatePrivateKeyFile = true;
      privateKeyFile = "/root/wireguard-private.key";

      postSetup = ''
        ${pkgs.iptables}/bin/iptables -t nat -A POSTROUTING -s 10.1.1.0/24 -o eth1 -j MASQUERADE
      '';

      postShutdown = ''
        ${pkgs.iptables}/bin/iptables -t nat -D POSTROUTING -s 10.1.1.0/24 -o eth1 -j MASQUERADE || true
      '';

      peers = [
        {
          # lorenzo
          publicKey = "gTqmM3TXHUAunBn59SJdKs9sDn0pMaPXdaFJXO3wxQM=";
          allowedIPs = [ "10.1.1.3/32" ];
        }
      ];
    };
  };

}

nix-info output:

❯ nix-shell -p nix-info --run "nix-info -m"
this path will be fetched (0.05 MiB download, 0.28 MiB unpacked):
  /nix/store/qgbwdnk91rk26b5bkd6qv5r6c2v733kb-bash-interactive-4.4-p23-dev
copying path '/nix/store/qgbwdnk91rk26b5bkd6qv5r6c2v733kb-bash-interactive-4.4-p23-dev' from 'https://cache.nixos.org'...
 - system: `"x86_64-linux"`
 - host os: `Linux 5.4.99, NixOS, 20.09.20210316.6557a3c (Nightingale)`
 - multi-user?: `yes`
 - sandbox: `no`
 - version: `nix-env (Nix) 2.4pre20210308_1c0e3e4`
 - channels(root): `"nixos-20.09.3346.4d0ee90c6e2"`
 - channels(asymmetric): `""`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`

Would appreciate any help!

PS: Previously, the nixos-rebuild command failed with a different error, which mentioned the lack of a valid signature. That turned out to be related to nix.trustedUsers and has since gone away, so it can be ignored.


I wonder if this has something to do with how OS Login keeps track of accounts.

There is no corresponding entry in /etc/passwd for the user I’m SSHing with (that’s the whole point of OS Login), and this might break when using --use-remote-sudo, maybe because (forgive the handwaving) the OS Login PAM modules are not being used?
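One way to test this theory is to check whether the uid of the SSH user resolves through NSS on the VM, since sudo prints "you do not exist in the passwd database" when it cannot look up the invoking uid. This is only a rough sketch: check_uid is a made-up helper name, and it assumes getent is available on the target.

```shell
# check_uid is a hypothetical helper, not part of any existing tool.
# It reports whether a numeric uid can be resolved through NSS,
# which is the lookup sudo needs to succeed for the invoking user.
check_uid() {
  if getent passwd "$1" >/dev/null; then
    echo "uid $1 resolves via NSS"
  else
    echo "uid $1 does NOT resolve via NSS"
    return 1
  fi
}

# On the VM, for the logged-in OS Login user:
#   check_uid "$(id -u)"
```

If the lookup fails for the OS Login user but works for root (uid 0), that would point at the NSS side of OS Login rather than at nixos-rebuild itself.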

Interestingly, if I run, on my laptop, the same command that nixos-rebuild eventually runs, namely

ssh wireguard-gateway sudo -- /nix/store/l856yh3syw494jgdz48bibf1rwqvj417-nixos-system-unnamed-20.09.20210317.12d9950/bin/switch-to-configuration switch

Then everything works fine.

So the sudo: you do not exist in the passwd database error only happens when using nixos-rebuild.
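Since the manual ssh + sudo call works, a possible stopgap is to do by hand the steps that nixos-rebuild automates. The sketch below is an assumption-laden workaround, not what nixos-rebuild literally runs: deploy_closure and DRY_RUN are made-up names, it assumes the SSH user is allowed to copy paths into the remote store (see the nix.trustedUsers note above), and it assumes nix-env is on the remote PATH under sudo.

```shell
# deploy_closure is a hypothetical helper; $1 is the SSH host, $2 the
# store path of an already-built system closure.
# Set DRY_RUN=1 to print the commands instead of running them.
deploy_closure() {
  host=$1
  out=$2
  run=${DRY_RUN:+echo}

  # Copy the system closure into the target's Nix store.
  $run nix copy --to "ssh://$host" "$out" || return 1
  # Register it as the new generation of the system profile.
  $run ssh "$host" sudo -- nix-env -p /nix/var/nix/profiles/system --set "$out" || return 1
  # Activate it, using the same script that worked when run by hand.
  $run ssh "$host" sudo -- "$out/bin/switch-to-configuration" switch
}

# Possible usage (build the toplevel locally first, jq is in the devShell):
#   out=$(nix build --no-link --json \
#     "..#nixosConfigurations.wireguard-gateway.config.system.build.toplevel" \
#     | jq -r '.[0].outputs.out')
#   deploy_closure wireguard-gateway "$out"
```

Running it once with DRY_RUN=1 shows the exact commands before anything touches the VM.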


I also encountered the same error when trying to deploy a similar configuration to a Google Cloud instance (with OS Login enabled) via deploy-rs.

@tailrecursive I created an issue for this. If you have further information, maybe you can post it there?

Are you also seeing the nscd crashes?
