I’ve debugged my issue already but I wanted to create this thread to hopefully save others from hours of debugging. Maybe there should be a lint/test/assertion for this in nixpkgs?
It all started when I went to install my working NixOS config onto a new system, for which secrets hadn’t been provisioned yet. I ran nixos-install but got this weird output:
< other output ... >
/nix/store/vqn6z63pwzkg6whhkzji5jdd16kflid6-sops-install-secrets-0.0.1/bin/sops-install-secrets: failed to decrypt '/nix/store/<one of my secrets>': Error getting data key: 0 successful groups required, got 0
/nix/var/nix/profiles/system/sw/bin/bash: line 12: /run/current-system/bin/switch-to-configuration: No such file or directory
At first, I disregarded that second-to-last line, because it’s completely expected for the secret key to not be provisioned yet, and the system should still boot fine without it. However, after debugging the missing switch-to-configuration for a while, I realized that /run/current-system was missing entirely: the activation script should’ve created it but hadn’t. The activation script was bailing on the sops-install error, which it shouldn’t have, since there’s no set -e at the top of the file, only an ERR trap that records the failure status to use as the exit code at the end of the script. Except I had added my own activation script:
system.activationScripts.createMyWgKey = lib.mkIf (!hasWireguardKeySecret) {
  deps = [ "users" ];
  text = ''
    set -euo pipefail
    ...
I begin almost all my scripts with set -euo pipefail, so writing it here was second nature to me. But what I didn’t realize is that “activationScripts” is not a set of independent scripts but instead snippets which are concatenated into a single activation bash script. My “set -e” caused the later sops-install failure to crash the entire system activation script, causing nixos-install to fail on the missing current-system!
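For anyone who wants to see the effect in isolation, here is a minimal sketch. It does not use the real NixOS activation script: `run_activation` is a hypothetical helper that concatenates a snippet with a fake "later step", the `false` command stands in for the failing sops-install-secrets call, and the ERR trap is only an approximation of what the generated script does.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the generated activation script: every snippet
# is pasted into ONE script and executed by a single bash process.
run_activation() {
  # $1 is the text of "my" snippet; the lines after it mimic later snippets.
  bash -c "
    _status=0
    trap '_status=1' ERR   # record failures instead of aborting (assumed form)
    $1
    false                  # stands in for a later failing step
    echo \"reached end of activation, status=\$_status\"
  "
}

# Snippet sets shell options directly: they leak into everything after it,
# so the later failure kills the whole script before the final echo.
leaky=$(run_activation 'set -euo pipefail; echo "my snippet"' || true)

# Same snippet wrapped in a subshell: the options die with the subshell,
# so the later failure is only recorded by the ERR trap.
contained=$(run_activation '( set -euo pipefail; echo "my snippet" )' || true)

printf 'leaky:\n%s\n\ncontained:\n%s\n' "$leaky" "$contained"
```

In the leaky case the final line is never printed, which mirrors /run/current-system never being created. Wrapping the snippet body in `( ... )` is one way to keep strict mode for your own commands without changing the rest of the script; the other is to simply not set shell options in activation snippets and check the commands you care about explicitly.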
Now I know that you have to be very careful not to modify the script environment in activation scripts, but this seems like a very easy way to cause obscure failures, so I thought I’d create this topic to warn others and also provide something to search for in case “switch-to-configuration” can’t be found by nixos-install for someone else in a similar situation.