Automatically update the `stateVersion` of a system?

system.stateVersion is a confusing concept at first: I used to think it was only needed to provide backward compatibility by changing some paths, but when you look at the code and experiment a bit, you realize that an old stateVersion can also pin some packages to an older version, as seen here:

mkDefault (if versionAtLeast config.system.stateVersion "20.03" then pkgs.postgresql_11
            else if versionAtLeast config.system.stateVersion "17.09" then pkgs.postgresql_9_6

This is true for postgresql, but also for nextcloud and others.

However, people interested in the latest versions of packages may want to update the stateVersion, but without breaking the state of the system (with stateVersion="20.03", for example, your database is located in /var/lib/postgresql/11.1, and with stateVersion="17.09" it is in /var/lib/postgresql/9.6). Is there any script that can take care of that migration automatically? If not, is it just that nobody has taken the time to write it, or is there some fundamental obstacle? Indeed, it’s quite hard for a user to know exactly all the commands that he might need to run to update a system from one version to another.

On the implementation side, I was thinking that a file somewhere on the system could record the current stateVersion of the system. Then, during a nixos-rebuild, if the stateVersion of the system does not match the one in the new configuration, the user could be offered to run all the scripts that migrate from the stateVersion of the system to the new stateVersion (of course, a warning should be displayed as it is then harder to ). We could even imagine a script that can also downgrade, useful when a user wants to come back to an earlier version of the system.
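A minimal sketch of that check, assuming a hypothetical file that records the stateVersion last applied to the machine (NixOS does not write such a file today, and the function names are made up for illustration). It only decides whether a rebuild would be an upgrade, a downgrade or a no-op; the migration scripts themselves are out of scope:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: compare a recorded stateVersion with the configured one.

# Print "same", "upgrade" or "downgrade" for a recorded and a target version.
compare_state_versions() {
  local recorded="$1" target="$2"
  if [ "$recorded" = "$target" ]; then
    echo same
    return
  fi
  # sort -V orders version strings, so 17.09 sorts before 20.03
  local lowest
  lowest="$(printf '%s\n%s\n' "$recorded" "$target" | sort -V | head -n 1)"
  if [ "$lowest" = "$recorded" ]; then
    echo upgrade
  else
    echo downgrade
  fi
}

# Read the recorded version from a file (assumed location, passed as $1)
# and report what a rebuild to the target version ($2) would mean.
check_state_file() {
  local state_file="$1" target="$2"
  if [ ! -r "$state_file" ]; then
    echo "no recorded stateVersion in $state_file"
    return 0
  fi
  local recorded
  recorded="$(head -n 1 "$state_file")"
  case "$(compare_state_versions "$recorded" "$target")" in
    same)      echo "stateVersion unchanged ($recorded)" ;;
    upgrade)   echo "migration needed: $recorded -> $target" ;;
    downgrade) echo "downgrade requested: $recorded -> $target" ;;
  esac
}
```

It could be called as `check_state_file /var/lib/nixos/state-version 20.03`, where the path is again only an assumption.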

What do you think?

The main argument I have heard against including automatic state migration was that it would be impossible to roll back the changes, negating one of the main benefits of NixOS.

As such, people are best off reading the documentation for the relevant packages and performing the migration manually after ensuring they have working backups.

We could create a script that would automate this unsafe migration (e.g. run pg_upgrade for PostgreSQL, install all major Nextcloud versions between the old and new stateVersion and successively run migrations on them…) but I do not think most of the software supports down-migration, so integration into nixos-rebuild is probably infeasible.


But when the user chooses to perform the migration manually, he will also run into the same troubles, and it won’t be any easier for him to roll back, so I’m not sure why it would be better to let the user do it by himself… I even think that the current way is much more error-prone, for two reasons:

  1. some people may just change stateVersion without realizing that it can break the configuration, and a warning could stop them doing so
  2. even if the user realizes that he needs to manually upgrade some programs, the exact steps are not clear at all. He needs to go over all the release notes, try to guess what they mean…

For example, I don’t see what commands I should run to upgrade to 20.03. Even the following instruction is not clear to me: do I even need to do something, or is it done automatically for me? If not, can I still roll back?

As long as the system.config.stateVersion is below 19.09 the state folder will migrated to its proper location ( /var/lib/systemd/timesync ), if required.

I would agree with your argument if the script were run automatically, but in my mind it should just be a helper script, and that’s why we should ask the user whether he wants to apply that script, with all the warnings it may require.

Then, I’m not sure that we need a rollback script: if the user realizes that there is an error with the new stateVersion, he could just change the stateVersion back and the old postgresql database will be used as before, no matter whether the new database is an export of the old one or not (which is the current behaviour of nix anyway). We usually don’t want to move the database back if we realize that there is an issue. NixOS just needs to make sure that the migration script does not break the use of the old state, which is quite easy I think (you can just do an upgrade and keep the old database in place). Of course we could add a second “purge” script that could be used when the user is sure that he won’t need to roll back, but that’s another question.

And even if the user still wants to get the new postgresql database back into the older postgresql version (say he ran the new version for some days before realizing he wanted to roll back), I would be curious to see how many modifications are really impossible to roll back. Even for postgresql, I guess it’s still possible to create a rollback operation that dumps the new database and tries to import it into the old one. And when this is really impossible, we just change the stateVersion without running any script, and it should be exactly as it is currently in nixos.

To summarize, I can’t see any disadvantage in writing a helper script: if the user refuses to run this script, we are back to the current behaviour of nix. If the user does want some help to migrate the stateVersion, then I think it’s safer to provide a ready-to-use script rather than letting him guess what to do from the release notes. And the script would be helpful anyway for people who do not trust it but would like to grab some inspiration from it…

From my point of view it works like this:
When I upgrade a system to a new NixOS release, I manually check which services have changed their default versions, create a backup of the existing data, set the package option of the service to select the new version, and perform the migration. Some services don’t have automatic migrations, and manual intervention is required in any case. Sometimes I don’t want to upgrade all services to the new default versions.
If I’m certain that I have migrated all relevant state, I might change the stateVersion.
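The backup step of that workflow can be sketched roughly like this, assuming the service keeps its state in a single directory and has been stopped first so the copy is consistent. The function name and the destination directory are hypothetical, not part of any NixOS tooling:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: archive a service's state directory before migrating it.

# Create a timestamped tar.gz of the state directory $1 inside $2 and
# print the path of the archive on success.
backup_state_dir() {
  local src="$1" dest_dir="$2"
  local name archive
  name="$(basename "$src")"
  archive="$dest_dir/$name-$(date +%Y%m%d-%H%M%S).tar.gz"
  mkdir -p "$dest_dir" || return 1
  # -C keeps the paths inside the archive relative to the parent directory
  tar -czf "$archive" -C "$(dirname "$src")" "$name" || return 1
  echo "$archive"
}
```

For example, `backup_state_dir /var/lib/postgresql/9.6 /root/backups` would be run after `systemctl stop postgresql`; the backup location is only an example.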

I think it’s important to note that the stateVersion does not cause any packages to be “pinned”, but rather sets the defaults.

If we had a helper script that could perform all migrations, we could just run it as part of the system activation after a stateVersion change, which would make it as easy as changing the stateVersion.
But this would add a ton of complexity, and I think it’s not possible with our resources to support all services, including all edge cases, with the required reliability.


And how do you do that check? Do you need to go into the NixOS code to see precisely what needs to be changed? Because from the release notes it seems pretty hard to guess.

I understand that it may require a bit of manpower to write and update the scripts, especially if we want to cover all edge cases. That said, the number of updates seems pretty small (basically only 3 modules need migration in 20.03, found with a grep in the nix source code…), and the script could be more of a helper script, so we don’t even need to cover every crazy edge case. We can always print the code of the script and let people decide whether they want to run it or not (people with crazy setups can usually see whether an upgrade script fits their needs, and beginners would be fine with most update scripts).

But at least if we don’t provide any script, the documentation should be as clear as possible (it’s hard to ask beginners “please read the source code of nixpkgs to see what you need to do”). Postgresql has example code in the documentation, but how should I update nextcloud, for example? The nextcloud documentation seems to say that we just need to move the data and config, so in nix does that translate to just backing up the nextcloud home folder and setting services.nextcloud.package = nextcloud18, or is there something more involved to do?

Surely all users run into this problem, not just men?

I’m not convinced we can safely write a helper script for arbitrary state migrations that won’t cause issues.

As it stands today, we have a scary warning in the default configuration telling users not to change the value. Perhaps the wording could be improved, I don’t know. In the past people have proposed changing the stateVersion to an opaque integer rather than a string matching the nixpkgs version; this would make people a lot less likely to arbitrarily bump it.

Beyond that, I think what would be valuable is having a script that can look at the user’s configuration and say “if you bump the stateVersion to the latest, here’s what will change”. This could perhaps be done by evaluating all options at the current configuration, then doing it again with the stateVersion overridden, and diffing the two. We’d need to wrap this up in some nice formatting of course, and make it intelligent, e.g. if I don’t have postgresql enabled I probably don’t care that the defaults will change. If we diff the config tree rather than the option tree, that would take care of that automatically, but it’s probably worth showing the option differences too, so I can tell that, e.g., if I don’t upgrade stateVersion and I later enable postgresql, it will default to an older package.
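To make the diffing step concrete, here is a rough sketch. It assumes the two evaluations have already been dumped to text files with one `option = value` line per option; that dump format and the function name are assumptions for illustration, and producing the dumps from a real configuration is not shown:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: report options whose value differs between two dumps.

# $1: dump evaluated with the current stateVersion
# $2: dump evaluated with the overridden stateVersion
# Prints "option: old -> new" for every option present in both dumps
# whose value changed.
diff_defaults() {
  local old_dump="$1" new_dump="$2"
  awk -F ' = ' '
    NR == FNR { old[$1] = $2; next }   # first file: remember the old values
    ($1 in old) && old[$1] != $2 {     # second file: report changed values
      printf "%s: %s -> %s\n", $1, old[$1], $2
    }
  ' "$old_dump" "$new_dump"
}
```

Feeding it dumps of the config tree would only show services that are actually enabled, while dumps of the option tree would also show defaults that matter later.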

Sure, sorry if my phrasing was clumsy, as you surely noticed I’m not a native English speaker. I’m aware of “inclusive writing” in French, but I have no idea if “s⋅he will […] for h⋅im⋅er to rollback” makes sense in English; feel free to propose a nicer way to express that.

Oh, maybe not. But again, it’s just to help the user, not to replace h⋅im⋅er. S⋅he still has the last word on whether s⋅he wants to run it or not.

This scary warning is really important, I agree. But it also has the downside that some people may think that it’s pointless to even try to upgrade this number, and they may end up with old/unmaintained/insecure packages without even realizing it. I do like the fact that you can see the year in that number, as people may wonder at some point “oh, this number is super old, maybe I could try to see what it means and how/why I could migrate it. The warning tells me not to change it directly, I will read more.”

That would definitely be a great idea, and it could help with or without a helper script (the two would complement each other).

Actually, I’m thinking that if nix does not want a helper script inside nix, we can still write it outside nix, and eventually package it as a standard nix package, with a command which could look like:

$ upgrade-state-version --current-state 17.09 --new-state 20.03
DISCLAIMER: upgrading the system state is potentially dangerous
and can lead to data loss.
This script CANNOT detect all unusual configurations, and is just
provided as a helper script. DO CHECK what commands it runs,
and first do a BACKUP of all your data.

We detected that the following modules are installed and should be migrated:
- [1] postgresql
- [2] nextcloud
In a first step, you will be asked to review the migration code for each module. You will be able to check the options of the script and then you will have the choice to:
 - abort the whole migration: a
 - validate the script: v
 - edit the script: e
 - skip the script (you will then need to perform the migration for that module manually): s
At the end of the review process, you will need to do a last confirmation step
in order to actually run all the validated scripts. At the end, you should be able to change:
Should we continue?[y/n] y
#### Review of migration of postgresql ####
Do you want to migrate postgresql automatically?[y/n] y
What is the path of the current database?
(/var/lib/postgresql/9.6/ was detected, simply press Return to validate it)

What is the path of the new database?
(/var/lib/postgresql/11.1/ is proposed, simply press Return to validate it)

The script to update postgresql will be the following:
    ## This script migrates the database from postgresql_9_6 to postgresql_11
    set -e
    export NEWBIN="$(nix eval --raw nixpkgs.postgresql_11.outPath)/bin"
    export OLDBIN="$(nix eval --raw nixpkgs.postgresql_9_6.outPath)/bin"
    export OLDDATA=/var/lib/postgresql/9.6/
    export NEWDATA=/var/lib/postgresql/11.1/

    # Create the database folder
    install -d -m 0700 -o postgres -g postgres "$NEWDATA"
    cd "$NEWDATA"
    # Initialize the database folder
    sudo -u postgres $NEWBIN/initdb -D "$NEWDATA"

    # Stop the old postgresql
    systemctl stop postgresql

    # Migrate the database
    sudo -u postgres $NEWBIN/pg_upgrade \
          --old-datadir "$OLDDATA" --new-datadir "$NEWDATA" \
          --old-bindir $OLDBIN --new-bindir $NEWBIN

Do you want to Validate, Edit, Skip or Abort?[v/e/s/a]
v
The postgresql migration is validated.
#### Review of migration of nextcloud ####
Do you want to migrate nextcloud automatically?[y/n] n
The nextcloud migration is skipped.
#### Last confirmation ####
The script that you will run is the following:
    ## This script migrates the database from postgresql_9_6 to postgresql_11
    set -e
    export NEWBIN="$(nix eval --raw nixpkgs.postgresql_11.outPath)/bin"
    export OLDBIN="$(nix eval --raw nixpkgs.postgresql_9_6.outPath)/bin"
    export OLDDATA=/var/lib/postgresql/9.6/
    export NEWDATA=/var/lib/postgresql/11.1/

    # Create the database folder
    install -d -m 0700 -o postgres -g postgres "$NEWDATA"
    cd "$NEWDATA"
    # Initialize the database folder
    sudo -u postgres $NEWBIN/initdb -D "$NEWDATA"

    # Stop the old postgresql
    systemctl stop postgresql

    # Migrate the database
    sudo -u postgres $NEWBIN/pg_upgrade \
          --old-datadir "$OLDDATA" --new-datadir "$NEWDATA" \
          --old-bindir $OLDBIN --new-bindir $NEWBIN
The above migration script has been copied in /tmp/migration_script.sh.
Do you want to run it now and perform the migration?[y/n]y
#### Migration ####
The script will be run...
[...]
Migration done!
Note that you wanted to manually perform the changes for nextcloud, DON'T FORGET to do it!
You can then check the changes by setting `system.stateVersion = "20.03"` in your configuration.

The gender-neutral pronouns in English are “they/them/their”, as in “they will […] for them to rollback”. Yes, it’s the same as the plural pronouns.

The downside here is that the script would need to be customized for every single change; it can’t just generically know how to migrate postgresql unless it’s been taught how to migrate postgresql. And even then it won’t know if there are any potential issues when migrating to any particular postgresql version unless it’s been taught about that version too.

Someone could still attempt this anyway, it just seems like a rather high maintenance burden.

This should be clear from the release notes. For example with stateVersion >= 20.09 the default deluge version was changed from 1.3.x to 2.x. The following text is part of the upcoming NixOS 20.09 release notes:

If you are upgrading from a previous NixOS version, you can set service.deluge.package = pkgs.deluge-2_x to upgrade to Deluge 2.x and migrate the state to the new format. Be aware that backwards state migrations are not supported by Deluge.

But you asked how I do it, and to be honest I never read the release notes because I’m lazy af :sweat_smile:

I usually keep track of the relevant modules in my head. By reading documentation and/or code of the modules that I use, I get to know things like that. Once I have seen that a specific module uses stateVersion, I will know it next time. Also, let’s take a look at how many modules are actually using it:

[pbb@onyx:~/proj/nixpkgs]$ rg "stateVersion" nixos/modules/services/
nixos/modules/services/web-servers/caddy.nix
71:      environment = mkIf (versionAtLeast config.system.stateVersion "17.09")

nixos/modules/services/x11/desktop-managers/xterm.nix
17:      default = (versionOlder config.system.stateVersion "19.09") && xSessionEnabled;
18:      defaultText = if versionOlder config.system.stateVersion "19.09" then "config.services.xserver.enable" else "false";

nixos/modules/services/network-filesystems/ipfs.nix
17:  defaultDataDir = if versionAtLeast config.system.stateVersion "17.09" then

nixos/modules/services/web-apps/nextcloud.xml
121:   <link linkend="opt-system.stateVersion">stateVersion</link> is declared properly. In that case

nixos/modules/services/web-apps/nextcloud.nix
45:  inherit (config.system) stateVersion;
350:          else if versionOlder stateVersion "20.03" then nextcloud17

nixos/modules/services/torrent/deluge.nix
183:      if versionAtLeast config.system.stateVersion "20.09" then

nixos/modules/services/misc/matrix-synapse.nix
356:        default = if versionAtLeast config.system.stateVersion "18.03"

nixos/modules/services/networking/radicale.nix
12:  defaultPackage = if versionAtLeast config.system.stateVersion "17.09" then {
38:        <literal>system.stateVersion &lt; 17.09</literal> and version 2.x

nixos/modules/services/networking/supybot.nix
24:        default = if versionAtLeast config.system.stateVersion "20.09"

nixos/modules/services/desktops/gnome3/gnome-initial-setup.nix
14:  # To prevent this we create the file if the users stateVersion
71:    ++ optional (versionOlder config.system.stateVersion "20.03") createGisStampFilesAutostart

nixos/modules/services/networking/syncthing.nix
393:            nixos = config.system.stateVersion;

nixos/modules/services/databases/postgresql.nix
226:        default= if versionAtLeast config.system.stateVersion "17.09" then "postgres" else "root";
245:      # ‘system.stateVersion’ to maintain compatibility with existing
247:      mkDefault (if versionAtLeast config.system.stateVersion "20.03" then pkgs.postgresql_11
248:            else if versionAtLeast config.system.stateVersion "17.09" then pkgs.postgresql_9_6
249:            else if versionAtLeast config.system.stateVersion "16.03" then pkgs.postgresql_9_5
253:      mkDefault (if versionAtLeast config.system.stateVersion "17.09"

nixos/modules/services/continuous-integration/hydra/default.nix
40:  inherit (config.system) stateVersion;
252:        else if versionOlder stateVersion "20.03" then hydra-migration

nixos/modules/services/databases/mysql.nix
303:      mkDefault (if versionAtLeast config.system.stateVersion "17.09" then "/var/lib/mysql"

[pbb@onyx:~/proj/nixpkgs]$

You get the idea, it’s not that many. As far as I can tell, the following services currently use stateVersion to set the default package version:

  • nextcloud
  • postgresql
  • mysql
  • radicale
  • supybot
  • deluge

Of those I just use postgresql and deluge. Not so hard to keep track of, right?