Using hashes for `stateVersion` instead of human-readable strings

This has been answered already, but I think there’s a simpler way to address it: what happens when one of those default values changes?

The postgresql default data path changes at some point, I upgrade to a new nixpkgs, and suddenly my database has gone! help!

This is what stateVersion captures: the default path on my system doesn’t change, because the module sets its default depending on stateVersion.
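A minimal sketch of that mechanism, assuming a hypothetical `services.exampledb` module; the module name, both paths and the `"25.05"` threshold are made up for illustration and are not taken from any real module:

```nix
{ config, lib, ... }:
{
  # Hypothetical module: "exampledb" and both paths are illustrative only.
  options.services.exampledb.dataDir = lib.mkOption {
    type = lib.types.path;
    # Systems first installed before 25.05 keep the old location;
    # newer installs get the new one. The threshold is an assumption.
    default =
      if lib.versionAtLeast config.system.stateVersion "25.05"
      then "/var/lib/exampledb"
      else "/var/db/exampledb";
    defaultText = lib.literalExpression ''
      if lib.versionAtLeast config.system.stateVersion "25.05"
      then "/var/lib/exampledb"
      else "/var/db/exampledb"
    '';
    description = "Directory holding the database files.";
  };
}
```

Because the default is computed from `system.stateVersion`, upgrading nixpkgs alone does not move the data directory.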

If I have to materialise all those defaults in my config, I have a lot more config, and much of it is boilerplate default values that hide my intent. I also have to migrate that stuff for config schema changes, or nixpkgs needs to keep config schema backwards compatibility. There’s a worse burden either way.
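To make the contrast concrete, here is a rough sketch of the two styles of user config. The option names are real NixOS options, but the specific package and path values are illustrative and not claimed to be what any given stateVersion resolves to:

```nix
{ pkgs, ... }:
let
  # Style A: intent only; the module derives the rest from stateVersion.
  withStateVersion = {
    services.postgresql.enable = true;
    system.stateVersion = "24.05";
  };

  # Style B: every state-relevant default spelled out by the user.
  # Values below are assumptions chosen for illustration.
  materialised = {
    services.postgresql = {
      enable = true;
      package = pkgs.postgresql_15;
      dataDir = "/var/lib/postgresql/15";
    };
  };
in
withStateVersion # swap for `materialised` to compare
```

Style B grows with every module that has state, and each of those lines has to be kept valid across option renames.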

That’s also why I’m wary of the idea of multiple module-specific versions: they need to be in my config and something has to add them when I pull in a module. I think we can explore the idea further, but it’s going to need to be automated somehow: something similar to a lock file, and perhaps checks and automated migrations that can be applied to advance each one? I’m not sure it’s worth it, but it would need to be something like this to be even workable, let alone worthwhile.

I do like the simpler idea of changing it to be something that doesn’t have the user hazard of looking like a version number that needs to be updated. Even a unix timestamp might be opaque enough as an integer, and even better in hex.

7 Likes

Yes, this seems like the only way to avoid it causing bloat, but it will have other issues. A whole bunch of users are going to copy foo.bar.enable = 1234 from some random blog post somewhere, and then wonder whether they need a different number from a different blog post when trying to diagnose any (probably unrelated) issues they have, change it, and get even more confused.

2 Likes

That does seem to be the simplest solution to the original question.
Bonus points that it’s already been tested on the nix-darwin side.

With stateVersion not based on fixed release versions, how often are merge conflicts because of stateVersion an issue?

You mean, merge conflicts from two PRs that want to bump the maximum stateVersion? That hasn’t happened, but we’re a much smaller project than NixOS. I think it happens rarely enough in NixOS that it shouldn’t be too big a deal, but a policy of bumping the version once per release could be adopted, anyway; it’d still help as long as it’s something more opaque than a release version.

1 Like

You misunderstood me.
In my scenario there wouldn’t be any default values for these things.
It would be up to the user to change them.
At maximum we would provide examples.

As TLATER pointed out there needs to be an abstraction at some point but stuff like paths to the DB data and the versioned package could easily be offloaded to the user.
In part we are already doing that, since it is recommended to pin your DB packages to a specific version.

I think we misunderstood each other… the first part of my post talks about current behaviour, describing what happens when defaults change, for clarity.

The paragraph that starts with “If I have to materialise all those defaults in my config” is addressing your proposal, and includes a related issue: what happens when (effectively) the template that was originally copied, instead of having defaults, is invalidated for other reasons.

1 Like

The integer solution is alluring but if it is considered for Nixpkgs then I would like to see some consideration and policy regarding third-party modules. Indeed, I would like to see it for nix-darwin as well (cough LnL7/nix-darwin#584 cough).

A potential solution would be to unconditionally increment the state version counter whenever a new release cycle starts. I still don’t like this, since you have no way of knowing what the counter will end up at when the next release cycle starts, which makes things difficult for a third-party module developer.

Thus, I personally still prefer having the state version match the release version. Potentially with an additional version component that can be incremented as needed during a release cycle. That is, one would start the cycle at 25.05.0 (or 25.05-0) and increment to 25.05.1, 25.05.2, and so on. Then third party modules still can target the two component state version and perfectly predict a future valid state version (assuming a consistent XX.05, XX.11 release cycle). It should also be backwards compatible with the existing state version schema allowing the continued use of convenient functions like lib.versionAtLeast.
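A small sketch of how such three-component values would compare, using only `builtins.compareVersions`. The local `versionAtLeast` helper is written here to mirror how `lib.versionAtLeast` is understood to behave; the version strings are the hypothetical ones from the paragraph above:

```nix
let
  # Local stand-in for lib.versionAtLeast: true when v1 is not older than v2.
  versionAtLeast = v1: v2: builtins.compareVersions v2 v1 != 1;
in
{
  # A module targeting "25.05" matches the first state version of the cycle.
  atRelease = versionAtLeast "25.05.0" "25.05"; # true
  # Mid-cycle bumps order as expected.
  midCycle = versionAtLeast "25.05.2" "25.05.1"; # true
  # Systems from the previous cycle stay below the new threshold.
  previousCycle = versionAtLeast "24.11.3" "25.05"; # false
}
```

A third-party module can therefore keep writing conditions against the two-component release value.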

3 Likes

I just looked at nixpkgs and most uses of stateVersion are just setting a package. There are exceptions like swraid, supybot, radicale, … (though radicale uses it incorrectly) but we can get rid of many uses of stateVersion within a year.

Why? It’s an impl detail of NixOS, not other projects.

And it’s absolutely a bad idea to have it match release versions, the whole beginner confusion comes from that. Whatever we change, we must at least change that.

If it must still be monotonically increasing, make it an integer string and start at say, "30" or higher.

5 Likes

It seems like most uses of stateVersion in NixOS fall into the following categories:

  1. Module foo has changed the default value of an option; this shouldn’t affect existing configurations.

    Practical example: The default value of virtualisation.oci-containers.backend was changed from docker to podman.

    Theoretical example: services.openssh.openFirewall currently defaults to true, unlike most instances of openFirewall, for backwards compatibility. If this were to be changed in the future, it would need to be gated by stateVersion, otherwise any existing systems relying on that default would lose SSH access.

  2. Package foo has been renamed upstream to bar. It should continue to be available as pkgs.foo in existing configurations to avoid breakage, but new configurations should start using pkgs.bar.

    Practical example: pkgs.codimd → pkgs.hedgedoc

  3. Package foo has changed its (database format/storage directory/other state) since a previous version, and we are able to automatically run a migration script if stateVersion is old enough to be applicable.

    Practical examples: services.sourcehut, services.timesyncd

  4. Package foo cannot be updated automatically. Perhaps it must be incrementally updated to each major version, or may require manual migration steps when upgrading. Unless the user specifies a specific package version, we must continue to install the same version by default for a particular stateVersion. If that version is no longer available in Nixpkgs, we need to throw an error with migration instructions.

    Practical examples: Nextcloud, postgresql
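Categories 1 and 4 from the list above could look roughly like this in a module. This is a sketch under assumptions: `services.example`, `pkgs.example_2`, `pkgs.example_3` and all version thresholds are hypothetical and only show the shape of the conditionals:

```nix
{ config, lib, pkgs, ... }:
let
  sv = config.system.stateVersion;
in
{
  options.services.example = {
    # Category 1: a changed default that must not affect existing systems.
    backend = lib.mkOption {
      type = lib.types.enum [ "docker" "podman" ];
      default = if lib.versionAtLeast sv "22.05" then "podman" else "docker";
      description = "Container backend to use.";
    };

    # Category 4: a package that cannot be upgraded automatically.
    package = lib.mkOption {
      type = lib.types.package;
      default =
        if lib.versionAtLeast sv "24.11" then pkgs.example_3
        else if lib.versionAtLeast sv "23.11" then pkgs.example_2
        else throw ''
          example_1 is no longer available in Nixpkgs. Migrate your data
          manually, then set services.example.package explicitly.
        '';
      defaultText = lib.literalExpression "pkgs.example_3";
      description = "Package to run; pinned per stateVersion by default.";
    };
  };
}
```

The `throw` branch is only evaluated for systems whose stateVersion predates every version still shipped, which is where the migration instructions belong.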


While [1] and [2] should continue to work fine with an opaque monotonic counter, and may benefit from not being bound to release versions, [3] and [4] seem to rely on release versions as a point of synchronization, and I’m unsure how an

There are positives to this proposal. It would allow for predictable conditions on a stable release, while not tying every stateVersion change to a release. However, it doesn’t address the original motivation of this thread, and it’s unclear how beneficial it would be over the current solution.

While I personally don’t have any interaction with third-party modules, I do agree there is utility in having a release-based stateVersion, as modules (third-party or otherwise) can easily include stateVersion conditionals without needing to coördinate to increment the current default stateVersion.

I don’t think it is accurate to assume we can drop most uses of stateVersion. Even if we no longer offer old versions of packages, or no longer include migrations, we should still throw on an old stateVersion rather than allow things to silently break with no explanation or instructions.

It is also an implementation detail of Nixpkgs, which may have effects on other projects in the Nix ecosystem.

I am not yet convinced that stateVersion being associated with a release version is a bad thing; however, I completely agree that stateVersion being visually identical to a release version is bad and confusing to many users.


If we only wish to make the value opaque to avoid confusion, we could define a new option that takes an encoded value for stateVersion, and stateVersion itself would default to a function that decodes this value.

As a trivial example, suppose on install we set a new option stateCode to a simple character translation (0123456789. → abcdefghijk) of stateVersion.
e.g. system.stateCode = "cekaf"; (“24.05”)
A Nix function would reverse this translation to set stateVersion at eval, to allow versionOlder and versionAtLeast to continue working as before.
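The translation maps the characters `0123456789.` to `abcdefghijk`, one for one. A minimal sketch of the encode and decode functions using only builtins; `stateCode` remains a hypothetical option, and wiring the decoded value into `system.stateVersion` is left out:

```nix
let
  digits = [ "0" "1" "2" "3" "4" "5" "6" "7" "8" "9" "." ];
  letters = [ "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" ];

  # Each list entry is a single character, so replaceStrings acts as a
  # plain character-by-character translation in either direction.
  encode = builtins.replaceStrings digits letters;
  decode = builtins.replaceStrings letters digits;
in
{
  encoded = encode "24.05"; # "cekaf"
  decoded = decode "cekaf"; # "24.05"
}
```

The decoded string is an ordinary version string again, so existing version comparisons would keep working on it.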

3 Likes

I mentioned that upthread and didn’t feel it bore repeating since it was obvious: we don’t just remove things without a migration path from status quo → warning → error.

That’s also why I said a year, because that’s two releases, long enough to reach an error.

Objectively untrue; all uses are within the nixos module system.

That’s… what I said? We shouldn’t use the same string or a too-similar string.

2 Likes

I mean, looking at your list:

I’d argue this is mis-using stateVersion in the first place.

This should just be mentioned in the release notes, and/or handled with a deprecation warning. It has nothing to do with the problems stateVersion is supposed to deal with - it’s not state at all.

Ditto to the above, this is why there is alias functionality in nixpkgs.
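As an illustration of the idea only (this is an overlay sketch, not how nixpkgs maintains its own alias list), a renamed package can stay reachable under its old name while warning on use. `foo` and `bar` are the placeholder names from the list being discussed:

```nix
# Overlay sketch: keep the old attribute name working, but warn when it is used.
final: prev: {
  # lib.warn prints the message and then returns its second argument.
  foo = prev.lib.warn "pkgs.foo has been renamed to pkgs.bar" final.bar;
}
```

Since attributes are evaluated lazily, the warning only appears for configurations that actually reference `pkgs.foo`.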

In my mind this is mostly why stateVersion exists, however it isn’t a great solution - this still results in breakage on unstable and makes it impossible to update packages that do this without a NixOS release.

I imagine that identifying whether migration is needed could simply happen as part of the migration script an overwhelming majority of the time, resulting in much better UX.
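A rough sketch of a migration that decides for itself whether it needs to run, without consulting stateVersion. Everything specific here is hypothetical: the `example` service, `pkgs.example`, the `example-migrate` tool and the `db.v1`/`db.v2` file layout:

```nix
{ pkgs, ... }:
{
  systemd.services.example.preStart = ''
    state=/var/lib/example
    # Migrate only when old-format data exists and new-format data does not.
    # Fresh installs have neither file, so nothing happens for them.
    if [ -e "$state/db.v1" ] && [ ! -e "$state/db.v2" ]; then
      ${pkgs.example}/bin/example-migrate "$state/db.v1" "$state/db.v2"
    fi
  '';
}
```

The check looks at the state on disk, so it behaves the same on unstable and on release branches.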

This was discussed further up; such modules could make their package option mandatory, then stateVersion is superfluous for them.
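A sketch of what a mandatory package option could look like, again with a hypothetical `services.example` module and a made-up `pkgs.example_3`:

```nix
{ lib, ... }:
{
  options.services.example.package = lib.mkOption {
    type = lib.types.package;
    # No `default` on purpose: if the option is read while unset,
    # evaluation fails and the user has to choose a version.
    example = lib.literalExpression "pkgs.example_3";
    description = "Which example package to run. Must be set explicitly.";
  };
}
```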

2 Likes

Debatable. I take your point that it’s not state on disk, but it’s UX state. It’s state in the system owner’s head, expectations about how the system works, and of course this example is particularly and deliberately pointing out a case where it could be awkward to recover.

I don’t mind the idea that we can use this mechanism to let old aliases expire so they don’t accumulate forever, acknowledging that it’s not strictly state either.

I don’t really understand why it has to be tied to releases, though. Sure, bump it for every release branch, but I don’t see the problem with more bumps in between. If a change happens in August, and another in October, a system installed from unstable in September can have one of those migrations and not the other, surely? It should make it easier to work out what needs changing, if anything, in the code, and keeps these changes from getting coupled together on a release. Each change is its own bump, and we can look at the history of those bumps.
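A tiny sketch of the September scenario, assuming an opaque integer counter where the August change bumped it to 41 and the October change to 42; both numbers are invented for the example:

```nix
let
  # Hypothetical system installed from unstable in September.
  stateVersion = 41;
in
{
  hasAugustChange = stateVersion >= 41; # true
  hasOctoberChange = stateVersion >= 42; # false
}
```

Each change gets its own threshold, so the two are not coupled by landing in the same release.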

Apologies; you are correct.

I agree.

(Ab)using stateVersion currently gives module authors a way to make breaking changes without causing a significant issue for the inevitable users who don’t read release notes or pay attention to warnings. I’m not sure if that’s a good thing or not, but it’s something to keep in mind.

I forgot about the alias functionality; that’s definitely the correct way to do this.

I completely agree.

I think part of the reason stateVersion has been used in this case is to allow these migration scripts to be confidently deprecated and dropped in the future, knowing they have no unintended influence on a newly installed system.

That could be somewhat annoying, but likely far better than the current leaky abstraction of hiding it until it breaks.
At the very least, applicable packages should warn if a version isn’t specified.

In this case, I believe NixOS has a fairly specific definition of ‘state’ to mean any data not contained within the configuration or store—usually existing databases or files from a previous configuration.

Is there no existing mechanism to warn on / deprecate aliases, similar to renamed options? I’m not highly familiar with the alias system.

3 Likes

My point is that because it’s not state on disk, there are better mechanisms to handle this. If nix prints a big “this setting will change with the next NixOS release” or “this package alias will be removed with the next NixOS release” every time you eval your config for 6 months, I think it’s fair game to actually do so. This is in fact how many modules do it today.
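One way such a warning can be sketched, assuming a hypothetical `services.example.openFirewall` option that uses `null` to mean "not set by the user"; `warnings` is the standard NixOS option for evaluation-time warnings:

```nix
{ config, lib, ... }:
let
  cfg = config.services.example;
in
{
  options.services.example.openFirewall = lib.mkOption {
    # null means the user has not made a choice yet.
    type = lib.types.nullOr lib.types.bool;
    default = null;
    description = "Whether to open the firewall for the example service.";
  };

  config.warnings = lib.optional (cfg.openFirewall == null) ''
    services.example.openFirewall is unset. It is currently treated as true,
    but this will change with the next NixOS release. Set it explicitly.
  '';
}
```

Users who set the option either way never see the message; everyone else is reminded on every evaluation.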

I’m with you on stateVersion not needing to be synced to releases, though if it changes too frequently that could make bisecting difficult.

7 Likes

In practice, the “this setting will change” case means after the next release, because users who only follow releases need to see it, and means waiting 6 months to get some code tidyup or important change through. That change then needs to land in between releases and get testing, before the following release. It’s a lot of coupling and delay, where being able to make the change now for all new installs, and maybe deprecate the backwards-compat stuff later is useful.

ETA: Oh, here’s another argument in favour of bumps between releases - they could allow several goes at getting a major change right, or breaking it up into several steps, within one release cycle.

2 Likes

Yeah, and this is useful.

We could however also change policy; something like users should not expect to be able to skip multiple NixOS versions in one go. Then migration scripts can be provided with a clear promise and removed on a schedule.

This will lead to breakage for people who don’t read documentation and never update their systems, but… I mean, depending on how the policy is set that’d be years of missing updates. Skipping past that many versions will cause issues anyway.

I’m not necessarily saying this should be done, but I do think there’s a world in which stateVersion is simply removed altogether, and I do think we could reduce the number of uses and narrow its scope significantly.

This is irrespective of my desire to change it to something less easily misunderstood; I think we should take away the footgun either way.

7 Likes

At this point the conversation is really hard to follow without closely reading all the posts, and I’d propose whoever is interested in driving this to conclusion would open a PR to Nixpkgs with a README.md listing the design trade-offs and a few examples for how it would be implemented in practice. A good template for this is @roberth’s https://github.com/NixOS/nixpkgs/pull/372170/ – it’s essentially an RFC without the bureaucracy. If you think the more formal process is needed in order to facilitate consensus because it seems unlikely to converge on its own, a pre-RFC in a separate repo may be the right approach instead.

3 Likes

To elaborate a bit, I think there ended up being three largely separate topics that should get PRs/further discussion:

  1. Changing system.stateVersion to something that isn’t a NixOS version
    • There’s pretty unanimous agreement that this should be done, only the exact shape the replacement should take and when it should be bumped isn’t fully agreed on.
    • I think a PR would be plenty of space to discuss this.
  2. Deprecating system.stateVersion and making it module-specific
    • Clearly desirable, but the impact on the ecosystem and third-party modules is huge, and there is worry about the amount of boilerplate this could add.
    • A PR might alleviate (or affirm) the latter, but I think a pre-RFC is warranted and the approach may be easier to swallow if 3. has at least been considered in more detail.
  3. Minimizing the use of system.stateVersion and clarifying in policy when it should be used
    • This would be nice because it makes the concept less nebulous, and it is clearly already not a very good way to handle the problems it is being used to handle.
    • We didn’t arrive at a full conclusion; I think a pre-RFC would be useful for this, as it could mean pretty wide changes in the contract of NixOS versions, and it would also impact third-party modules.
10 Likes

To add:

  1. @rycee raised a question about policy/framework for third-party projects with regard to system.stateVersion i.e. we should either designate it as an impl detail of NixOS or if it is actually meant to be a usable API, clarify how external projects could make use of it in 3rd party modules.
4 Likes

And:

  1. IF we keep anything like system.stateVersion, for god’s sake change the name (after a suitable period of deprecating the current name, of course). We’ve been talking about how bad the name is since before most of you were born*** - including on this Discourse among other places.

*** not literally true

2 Likes