I have a NixOS configuration that shows this warning when evaluated:
nix-repl> nixosConfigurations.example.config.system.build.toplevel
evaluation warning: system.stateVersion is not set, defaulting to 24.11. Read why this matters on https://nixos.org/manual/nixos/stable/options.html#opt-system.stateVersion.
«derivation /nix/store/5ri44cy5z3ps8m213gkwrwafwv3wglsh-nixos-system-example-24.11.20250313.cdd2ef0.drv»
But when I examine the config in a repl I see that system.stateVersion is set:
I have tried using NIX_ABORT_ON_WARN=true and --show-trace, but the trace is massive and seemingly doesn’t really help with identifying what is being evaluated. I wonder if there’s a better way to identify the source of this warning.
For example what @eblechshmidt did by showing me how nixos-option can be used :D. I wasn’t aware of that tool. That’s useful. Any hints at how Nix configuration can be debugged in this case would be useful.
You know, one could say that this is kinda like teaching a man to fish instead of giving him fish :D.
Indeed. I will continue looking through it, but I wonder if anything else could be done. I seem to recall that there was a way to initialize repl after a crash. Not sure if it would help in this case but maybe.
I’m not an expert but it seems to me like repl in that case only helps when looking up elements of the tree UP, but it’s bad at looking down the tree of evaluation. But maybe I was using it wrong.
@waffle8946, you are absolutely right. Somehow the repl shows a different version (“23.05”) than the “defaulting to” one (“24.11”). That is why I thought it might be interesting to see why this is.
I have to agree with @waffle8946 here. It is extremely hard to teach you how to fish if you refuse to show us whether you are trying to use a fishing rod or a net. If you follow this Discourse a bit you will notice that providing some kind of minimal configuration that reproduces the problem is the minimum needed for other people to help.
Of course you’re right that if I don’t provide a config it’s harder to help. But I disagree that nothing can be done if there’s no configuration. I think any problem should be debuggable by anyone as long as the tooling makes it possible. If Nix tooling makes it impossible to see where this warning comes from there’s something wrong with the tooling that could be improved, and I’m simply here too early =].
One thing I thought of is maybe using a local fork of nixpkgs with modified NixOS modules, including the part that actually triggers this warning, to get more info. What do you think about that?
To be slightly more accurate, it’s set in the default value of the option:
And the warning comes from a check to see if the value has priority 1500, i.e. default option priority:
Since a types.str option cannot be defined twice with the same priority, and only the highest priority matters, this implies it’s otherwise unset, and nixos-option will only return the result pointing to this very module, i.e. I don’t see it giving us new info.
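One way to confirm this from the repl is to look at the option's metadata rather than its value. The following is a minimal sketch, assuming the argument is the result of `lib.nixosSystem` (for example `nixosConfigurations.example`); the file name is hypothetical, and the literal 1500 is taken from the explanation above rather than looked up from nixpkgs.

```nix
# inspect-state-version.nix (hypothetical file name)
# Takes an evaluated NixOS system and reports where system.stateVersion
# gets its value from, without forcing the value itself.
nixos:
let
  opt = nixos.options.system.stateVersion;
in
{
  # Priority of the winning definition; lower numbers win.
  inherit (opt) highestPrio;

  # Files that contributed the winning definition(s).
  inherit (opt) files;

  # True when only the option's own default applies, which is the
  # condition described above as triggering the warning (priority 1500).
  onlyOptionDefault = opt.highestPrio == 1500;
}
```

In the repl this could be used as `(import ./inspect-state-version.nix) nixosConfigurations.example`. If `onlyOptionDefault` is false for the top-level system, the warning is most likely coming from a different module system evaluation.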
To identify where the error is coming from. For example, as far as I know it doesn’t indicate currently if it’s from a container or the host operating system. That alone could be a big improvement to this warning. But I’m actually not sure yet how this could be achieved here. I’ll try reading this part and around a bit more to see if it’s possible.
Yes, this is why my first thought was to assume that it’s a container… but there’s no containers.
Which suggests to me that there’s some other instance in which the system configuration is evaluated without that stateVersion part. The trace is in like 95% generic lib files like attr.nix, and the rest means nothing in this context as far as I can tell.
I’m going to look into Nix builtins to see if there’s something I can do in the place of the warning to indicate where it is. Or at least differentiate if it’s a container or actual host system.
Sure, but that option isn’t the only place the warning could come from. It could be from some specialisation or as they mentioned from a container, or some other location where that module is imported via a new module system evaluation.
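To narrow down which nested evaluation emits the warning, each candidate can be forced separately. This is only a sketch, assuming the standard `specialisation.<name>.configuration` and `containers.<name>.config` option paths are present in the configuration; the file name is hypothetical, and other sources of nested evaluations (for example third-party modules that call the module system themselves) are not covered.

```nix
# nested-state-versions.nix (hypothetical file name)
# Takes an evaluated NixOS system and exposes system.stateVersion for the
# host and for each nested evaluation. Attributes are lazy, so forcing them
# one at a time shows which one prints the warning.
nixos:
let
  inherit (nixos) config;
in
{
  host = config.system.stateVersion;

  # One entry per specialisation, if any are defined.
  specialisations = builtins.mapAttrs
    (_name: spec: spec.configuration.system.stateVersion)
    config.specialisation;

  # One entry per declarative container, if any are defined.
  containers = builtins.mapAttrs
    (_name: container: container.config.system.stateVersion)
    config.containers;
}
```

In the repl, bind the result with `r = (import ./nested-state-versions.nix) nixosConfigurations.example` and then evaluate `r.host`, `:p r.specialisations`, and `:p r.containers` one by one; the expression that prints the warning points at the evaluation that lacks the setting.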
Improving the error would, to my knowledge, require passing in more information into the submodules (if I had to do it, it’d be via a module arg). @jakubgs if you want to look into improving the error that could be one approach.
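The module-arg idea can be illustrated outside NixOS with a small standalone evaluation. This is a sketch under stated assumptions, not the actual nixpkgs code: `evalContext` is a hypothetical module argument, the option is a simplified stand-in for `system.stateVersion`, and `<nixpkgs/lib>` is assumed to be resolvable on `NIX_PATH`.

```nix
let
  lib = import <nixpkgs/lib>;

  # Simplified stand-in for the module that declares the option.
  # `evalContext` is a hypothetical argument naming what is being evaluated.
  stateModule = { lib, options, evalContext, ... }: {
    options.stateVersion = lib.mkOption {
      type = lib.types.str;
      default = "24.11";
      # Warn when the winning definition is only the option default.
      apply = v:
        lib.warnIf
          (options.stateVersion.highestPrio == (lib.mkOptionDefault { }).priority)
          "stateVersion is not set in ${evalContext}, defaulting to ${v}."
          v;
    };
  };

  # Whoever starts a nested evaluation would supply the context string.
  eval = lib.evalModules {
    modules = [
      stateModule
      { _module.args.evalContext = "container 'web'"; }
    ];
  };
in
eval.config.stateVersion
```

Evaluating this with `nix-instantiate --eval` should print a warning that names the context. In NixOS itself, every place that starts a new evaluation (containers, specialisations, and so on) would have to set such an argument, which is the cost of this approach.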
Yeah, I’m looking at it right now. The file has access to config, so I could at the very least add printing of some info about the system being evaluated. Even printing config.system.name could be useful.
This is a general problem in the module system, the generic solution of passing line/file information and more trace history down the module stack has been proposed and shot down many times as the evaluation cost would be too high. There has been discussion on big module system redesigns that could help, but have other tradeoffs.
Your PR may well be accepted/help a little, but it doesn’t really solve the problem properly. Personally I don’t think it should be added as a workaround for one specific error, instead you should adapt your debug strategy.
IME the easiest way to debug this is to comment out module imports and re-add them bisect style until I find the culprit.
Yeah, I get that, and Nix could simply have a debug build of the interpreter that you use when you do want to sacrifice the performance for debugging power.
I do agree that in a perfect world the warning would indicate what was evaluated, but this is kinda difficult as we are essentially warning on a negative. Not that something is set but that it’s not set. Which means there’s no specific line or file to show, since it’s the whole thing that’s at fault.
I do think my PR is a major improvement, since currently the warning is just silly. When evaluating a configuration for a fleet of servers you get flooded by “there’s an issue, good luck finding it sucker” warnings all over the place. Not a great user experience.