Why does NixOS not have a rolling release system?

dschrempf · November 12, 2021, 5:00am

This is a question that has been nagging me for a long time. Why is it that we have releases with specific versions? Why not just have nix[pkgs,os]-stable, and nix[pkgs,os]-unstable? Nixpkgs and NixOS seem predestined to be rolling release.

I can see that there are some references to stateVersion in the Nixpkgs repository code. However, at first glance, most of them are avoidable. For example, for Nextcloud, services.nextcloud.package is set to a value dependent on the stateVersion. I suggest making services.nextcloud.package a mandatory setting rather than an optional one, in order to avoid usage of stateVersion.

I think with this handling, one could get rid of stateVersion altogether, greatly easing the release cycle process. On the other hand, I don’t have experience with the release cycle (ZHF, etc.). Maybe you can enlighten me?

cdepillabout · November 12, 2021, 5:14am

Why not just have nix[pkgs,os]-stable, and nix[pkgs,os]-unstable?

How would you imagine nixpkgs-stable and nixos-stable working? That is, how would they be different from the current nixpkgs-unstable and nixos-unstable?

While I don’t run nixos-unstable, I know that there are a lot of people that do. I’m under the impression that it ends up being pretty “stable”. What additional guarantees would you want from a nixos-stable?

dschrempf · November 12, 2021, 5:30am

I imagine the stable channels lagging behind. I run nixos-unstable, and it is pretty stable when the build succeeds. However, this is often not the case (for example, right now, filelock is missing from Python 2).

uep · November 12, 2021, 6:27am

I think it’s relevant to ask what problem you expect or want it to solve.

One possible interpretation of -stable is that it’s an alias for the current release -21.05, and would switch to be an alias for -21.11 when (or shortly after) that release gets finalised. This solves the “problem” of having to explicitly call nix-channel to follow the “current release”.

Another possible interpretation is that you want a different (rolling) release cycle, either instead of, or in addition to, the current half-yearly releases. That solves a “problem” of releases having some stale versions of things.

This suggests something more like the latter case. But it’s not just a time-based criterion, there needs to be some other definition of “stable” and selection for which packages get promoted and when, which means a different QA and development process around release management.

I can see the appeal, but will point out that the distinction is perhaps not quite as stark as you might expect:

stuff gets pulled up to the current release branch pretty regularly, and it’s even policy or necessary for some packages that this happens (signal desktop comes to mind).
-unstable is already basically a rolling release, and code goes through several earlier branches (staging, staging-next, master) before the successful build gets tagged (and usually the -small variant tag has been a little ahead of -unstable as well).

As you point out, it’s generally pretty good (I run it too!) and the name is at least in part to just set reasonable expectations from end users. Of course, it’s always possible to do better.

Ultimately that boils down to more tests, regardless of whether they’re added in the -unstable build or an additional phase that follows to some hypothetical -stable or -rolling tag. Honestly, everyone would love to have those tests, and having to make multiple builds to split out those tests into stages doesn’t seem like the hard part of the problem.

NobbZ · November 12, 2021, 6:56am

This would be quite the opposite of “stable”, as it would introduce changes that are very likely to break your system every 6 months.

The purpose of the versioned channels is to not introduce those changes.

If one doesn’t care about stuff occasionally breaking due to updates, the unstable branches are fine and can be considered a rolling release. That’s why the proposal to rename the “unstable” channels to “rolling” comes up every now and then.

Unconditional delays wouldn’t help anyone. They would cause important security updates to arrive late in the stable branches, while at the same time they wouldn’t stop build failures from coming through.

Python 2 is EOL and should get removed from nixpkgs in my opinion… Probably not with 21.11 as there was no proper deprecation within the tree, but that should be announced with the release and then removed next May… (This is an opinion which I know is not very popular)

dschrempf · November 12, 2021, 12:46pm

I see, thank you for your replies!

I was looking for something like nixpkgs-unstable, just more stable :-). By this, I don’t mean the actual packages, but the stability/reliability of the evaluation so to speak (i.e., Nixpkgs). I still think that this is a valid idea. Security fixes, for example, could be merged to the stable rolling release channel right away.

I thought that maintaining two (three?) releases is more work than having one rolling release channel.

Also, I don’t really like that we have stateVersion. I don’t think it is necessary, and it confuses me and a lot of other people. It should not be there. My configuration should not depend on when I installed NixOS (or started using Nixpkgs). But then, I only see some uses of stateVersion, and those can be avoided. Maybe there are other use cases I am not aware of.

With respect to things breaking: shouldn’t that be the case? If a breaking change is introduced, it should jump in your face. Hey, your Nix expression doesn’t compile because there is this thing you should amend/check/whatever. Of course, there should not be any breakages after evaluation. If the configuration evaluates, it should build and work. For example, with respect to the Nextcloud stateVersion usage: after making services.nextcloud.package mandatory, everybody who had not set it would have to set it.

With respect to Python 2 being EOL. I agree. Nevertheless, there is always something inhibiting evaluation. This time, it was good old, unnecessary Python 2.

Thanks again for your input!

sternenseemann · November 12, 2021, 1:58pm

The whole point of released versions is that there is a guarantee that there won’t be any breaking changes, both to the modules and the distributed packages (as far as it’s possible). This is critical for deployment and maintenance to be sustainable (on a larger scale or if you have limited time): You can be sure that no update will break your configuration, making it simple to keep a system up to date and secure.

Specifically it means that you can plan for adjusting your configuration(s) every half a year (which can be quite time consuming for a lot of systems / non-trivial configurations) instead of having to adjust your configuration to a random breaking change when you just wanted to update your system to get the next OpenSSL security fix.

nixpkgs’ declarativeness and evaluation model make backwards-incompatible redesigns “safer” than on regular distributions, but adapting your system to breaking changes still requires administrator time, whereas updates are often time-critical.

dschrempf · November 12, 2021, 3:40pm

@sternenseemann this is a convincing argument, thank you.

It seems to me that this exposes one of the (very few!) weaknesses of Nix. Namely, that you cannot (at least not in an easy way) pull in one change (let’s say a security change) without also pulling in all other changes that have been made (merged) in the meantime. On Arch/Debian/…, you can just postpone updating a major version of PHP, if you really wanted. Of course, then you also don’t have any guarantees that your package set plays together well.

Mic92 · November 12, 2021, 3:43pm

It’s quite possible to pin an older version of a package by importing an older nixpkgs version. We also have all maintained php major versions to choose from anyway.
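A minimal sketch of the pinning approach described above, assuming a classic (non-flake) configuration.nix. The nixos-21.05 branch tarball and the choice of php as the pinned package are illustrative assumptions, not something stated in this thread; in practice you would probably pin an exact commit and add a sha256.

```nix
{ config, pkgs, ... }:

let
  # Assumption: an older nixpkgs snapshot, here the nixos-21.05 branch.
  # For reproducibility, replace the branch name with a commit hash and
  # pass a sha256 to fetchTarball.
  oldPkgs = import (builtins.fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/nixos-21.05.tar.gz";
  }) { system = pkgs.stdenv.hostPlatform.system; };
in
{
  environment.systemPackages = [
    pkgs.git      # from the channel the system follows
    oldPkgs.php   # illustrative: taken from the older snapshot
  ];
}
```

Because both package sets live side by side in the Nix store, the pinned package keeps its own dependency closure and does not affect anything taken from the newer channel.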

TLATER · November 12, 2021, 4:04pm

Just to add to that, in fact, it’s easier in NixOS because you have a stronger indication that your package sets will play together nicely, because their dependencies can retain the version they were without compromising security for other packages (because the nix store is great).

I.e. if nextcloud doesn’t work on a newer php you can pin nextcloud’s php, but upgrade everyone else’s.

The problem is the more hidden interactions, say between X and a GUI application, where NixOS behaves exactly like other distros. The advantage of a versioned stable branch is that you don’t need to worry about any of this (or file a bug upstream).

dschrempf · November 12, 2021, 6:58pm

Yes, Nix is great with having different versions of packages installed. I chose a bad example. I was referring to the fact that we as Nixpkgs users can only pull in all the changes merged into Nixpkgs at once, some of which may be breaking. We cannot merge in only the security-related changes, for example.

xfix · November 12, 2021, 7:05pm

On Arch Linux partial upgrades are explicitly unsupported, see System maintenance - ArchWiki.

dschrempf · November 12, 2021, 8:04pm

Maybe I am miscommunicating. I don’t want to do a partial upgrade. I want to do a full upgrade.

However, I cannot do this, because my configuration does not evaluate (because of some dependencies). I could now disable the affected derivations, or pin them to older versions but that is a lot of work. What I tend to do is wait until the problem is fixed in Nixpkgs.

However, this sometimes takes weeks. Right now, Google starts to complain that my browser is outdated although I am running nixos-unstable. I would love to update all packages but the ones failing to evaluate because of the faulty dependency. Do you see my point? I never had this problem with Arch Linux, which I had been using for a long time.

So I was thinking, can we improve on this? Maybe a nix[pkgs,os]-stable channel would be a nice alternative, but I see there is a lot of tension and there are many counterarguments.

jonringer · November 12, 2021, 10:34pm

This seems to be nextcloud specific. Looks like the intention was to make the upgrade path more ergonomic for people upgrading their system, as nextcloud upgrades seem to only support bumping a single version. And they wanted to enforce that people first bumped to v20 before going to v21, or bump to v21 before going onto v22.

I agree that in this situation it should probably be on the user to determine the package version, as this looks to be a stateful operation, and nix doesn’t really have a good way to do stateful upgrades.

stateVersion in practice shouldn’t do much except determine where things are located. For example, stateVersion will help determine where a postgresql database should be located, but not much else. The stateVersion should be more like, “where you should be able to find things”, and less “determine the version or configuration of something”.
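As a rough sketch of the “where you should be able to find things” idea: a module can pick a default state location based on stateVersion. The service name exampled, its paths, and the 21.11 cut-off are all hypothetical and only serve as illustration; this is not taken from a real NixOS module.

```nix
{ config, lib, ... }:

{
  # Hypothetical option, for illustration only; not a real NixOS module.
  options.services.exampled.dataDir = lib.mkOption {
    type = lib.types.path;
    # Systems first installed before the (assumed) layout change keep the
    # old location, so existing state is still found after an upgrade.
    default =
      if lib.versionAtLeast config.system.stateVersion "21.11"
      then "/var/lib/exampled"
      else "/var/db/exampled";
    description = "Directory where exampled keeps its state.";
  };
}
```

With this shape, stateVersion only selects a location for existing data; it does not choose which version of the software runs.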

Also, the releases don’t really know about the stateVersion; it is the modules that are aware of stateVersion, and they use the release versions as a convention.

jonringer · November 12, 2021, 10:44pm

We have stable releases; the current stable is nixos-21.05. As others have mentioned, we just have a 6 month cadence on keeping it alive. [With our current volunteer manhours (people hours?)] we will never have an LTS-like branch, it just requires too much work to constantly backport potentially relevant items. It’s hard enough doing that with just a difference of 6 months, let alone years.

I could now disable the affected derivations, or pin them to older versions but that is a lot of work. What I tend to do is wait until the problem is fixed in Nixpkgs.

Generally I will fix them in a local checkout of nixpkgs, then chunk up the changes into PRs and upstream them. You can apply changes locally by doing, sudo nixos-rebuild -I nixpkgs=$PWD switch, or use path urls if you are using flakes.
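For the flakes variant, a minimal sketch of what a path URL could look like. The checkout location /home/alice/src/nixpkgs and the host name myhost are hypothetical placeholders, and the layout assumes a single-host flake.nix next to an existing configuration.nix.

```nix
{
  # Assumption: a local nixpkgs checkout containing your fixes.
  inputs.nixpkgs.url = "path:/home/alice/src/nixpkgs";

  outputs = { self, nixpkgs }: {
    # "myhost" is a placeholder for the machine's host name.
    nixosConfigurations.myhost = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [ ./configuration.nix ];
    };
  };
}
```

Such a flake would then be built with something like nixos-rebuild switch --flake .#myhost, and the input can be pointed back at the regular channel once the fixes have landed upstream.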

I don’t intend for all users to upstream PRs, but it is nice. And when a PR fixes a package, they are usually quickly merged as they are generally a net positive.

Right now, Google starts to complain that my browser is outdated although I am running nixos-unstable.

One thing I do is update my configuration.nix and home-manager independently. Generally they will have smaller scopes, so it’s less likely that a particular evaluation will fail.

uep · November 12, 2021, 10:56pm

Ahah! TIL.

I assume it’s most useful to do this with a checkout of the current tagged revision, so that the majority of stuff still comes from the build cache - but very useful tip. Thank you.

jonringer · November 12, 2021, 11:47pm

Population of the nixpkgs cache is a continuous process. The jobset gets triggered every 4 hrs (IIRC), so there’s rarely a time when there’s not a populated cache.

Updates to the release channel can see long delays, if certain things are failing. https://status.nixos.org/

uep · November 13, 2021, 12:36am

Fair enough. I was conflating two things:

likelihood of being in the cache (yet)
likelihood of actually building

What I actually want is the latter; if my goal is to fix/tweak/upgrade a particular package to PR, I mostly don’t want to trip over other unrelated build failures. So unless I’m trying to fix one of those build failures holding the release channel back, I’ll use that branch.

Of course the cache is very useful too, I just hadn’t really thought about the fact that it will have (most) stuff ahead of the channel. Which also means I can use this the other way. If there’s a fix for something I want in the tree, ahead of the current channel, but before something else that’s broken the build and prevented channel updates for days (as has happened recently), I can pick that revision and still have stuff from cache. Or even take something more current, if the build failure is something I don’t use.

uep · November 13, 2021, 1:07am

I suspect this is, at least in part, an issue with visibility. You see into earlier parts of the pipeline. With most distributions, you just check for updates, and if there are none or few, you don’t really think about it, and you don’t really see the build failures to even realise you’re waiting for them.

Ma27 · November 15, 2021, 3:05pm

This is perfectly doable via services.nextcloud.package = pkgs.nextcloud22; for instance. This is also the preferred way to do it. However, Nextcloud is inherently stateful and if you accidentally deploy some wrong config, you can screw up quite a lot (been there, done that), so a few helpers don’t hurt. Also I don’t expect an administrator to keep track of everything that happens in an upstream package (as a maintainer of this, it’s my job!), so it’s better to try solving most of the issues (considering that the alternative to using NixOS is “just pull this Docker image”).
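In context, the explicit setting could look roughly like the fragment below. The host name is a hypothetical placeholder, and the fragment leaves out the other settings a working Nextcloud deployment needs (admin credentials, database, and so on).

```nix
{ pkgs, ... }:

{
  services.nextcloud = {
    enable = true;
    hostName = "cloud.example.org";  # hypothetical host name
    # Chosen explicitly instead of being derived from system.stateVersion.
    # Nextcloud upgrades only support bumping one major version at a time,
    # so change this attribute step by step (e.g. 21 -> 22) and deploy in
    # between.
    package = pkgs.nextcloud22;
  };
}
```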

If one’s interested, I shared some thoughts in https://nixos.mayflower.consulting/blog/2021/01/28/nextcloud-stateversion/.