What should stable NixOS prioritize?

FRidh · November 5, 2020, 4:54pm

Yes! Shorter time between branch-off and release but with a freeze period for core changes before it.

Of course at that point those running master/unstable will be more affected.

jonringer · November 5, 2020, 9:50pm

Ideally master would be “well polished” at that point. Do you mind clarifying how unstable users would be more affected?

tobiasBora · November 5, 2020, 9:52pm

I also think that stability is more important, especially in NixOS: if I really want one recent piece of software, I can still install it from unstable and keep my system on stable. Also, I’m curious to know: is there enough manpower to deal with two releases per year? I have the feeling that it takes quite some energy to release and maintain a stable branch (it needs backports, but the backports usually cannot be done without some changes…), and I don’t know if it is really important to have two releases per year. If this energy could instead be used to maintain a more stable unstable branch, I would say that could be interesting.
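
The mixed setup mentioned here (system on stable, a single package from unstable) can be sketched in configuration.nix. This is only a minimal sketch, assuming a second channel named `nixos-unstable` has been added next to the stable one; the channel name and the two packages are illustrative assumptions, not taken from this thread.

```nix
{ config, pkgs, ... }:

let
  # Assumption: the channel was added beforehand with
  #   nix-channel --add https://nixos.org/channels/nixos-unstable nixos-unstable
  #   nix-channel --update
  # The system's nixpkgs config (e.g. allowUnfree) is reused for the second package set.
  unstable = import <nixos-unstable> { config = config.nixpkgs.config; };
in
{
  environment.systemPackages = [
    pkgs.git          # taken from the stable channel the system follows
    unstable.firefox  # illustrative: one package taken from unstable
  ];
}
```

Only the packages referenced through `unstable` come from the second channel; modules and everything else keep following the stable release.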

jonringer · November 5, 2020, 10:01pm

The branches diverge over time; at a certain point, it becomes non-trivial to backport changes from master into the release branch. This is also why NixOS will probably never do a 4-5 yr LTS. That’s just too much upkeep, and would probably mean making in-house patches for stuff to still build.

Yes, but I think we just have to do the practical thing given what we are supporting. Most distros don’t ship with 17 desktop managers, 34 window managers, and 60k+ packages. One of the reasons why I was proposing the delay to YY.05 and YY.11 release dates was to help align the releases with “good points in time” for desktop managers on unstable, which constituted the most “in-house” work for 20.09. In this last release, plasma was the biggest pain, as we were shipping systemd 246 and a mix of qt 5.12, 5.14, and 5.15 with plasma 5.18.5, which really shouldn’t exist together.

Shipping plasma 5.19 could have been an option, but it would have been EOL a week or two after the original target release date, and that’s awkward in case there are security issues. 5.18 is an LTS, so we can rely on upstream for providing patches, but not for being compatible with things like the latest systemd. Moving the release date to November would mean that we could have packaged plasma 5.20 and kdeApplications on master before ZHF would start, and the 5.20 plasma release would have been supported for the majority of the 20.09 NixOS release. For gnome, we would just be lagging behind their releases by 4-6 weeks, which should be enough time for them to be packaged in nixpkgs.

FRidh · November 6, 2020, 8:13am

A freeze period means certain changes won’t come in for a while which people running a “rolling distro” may expect.

jonringer · November 6, 2020, 8:34am

Cadence for staging is roughly 2 weeks when we aren’t “cramming”. It would be a single ~2 week cycle skipped. PRs for bumps would still be allowed into master as long as they wouldn’t cause any regressions.

tobiasBora · November 6, 2020, 10:02am

Makes sense, thanks a lot for the clarification! For me any release time is good, even if it’s not always the same month depending on what’s practical for you. And thanks everybody for the great work, you’re amazing

bhipple · November 7, 2020, 2:48pm

I know others have chimed in already, but I’d like to repeat that from my perspective shifting to NixOS 20.05 and 20.11 is essentially zero cost / downside, aside from the initial one-time churn and confusion. If @jonringer and other release managers think this simple shift will make a big positive difference I’d say we should do it. It sounds like there’s a lot of other stuff we can and should improve, too; but a simple +2m shift is a really easy hack.

nh2 · November 7, 2020, 6:24pm

We use NixOS for business (https://benaco.com) on the server and I use it on the desktop.

My opinions:

If shifting the release months reduces work, let’s do it. Companies don’t care in which months the stable releases are, only that they exist.
Heavy -1 on going rolling-release only. This would immediately disqualify NixOS for most companies. OS upgrades are already some of the riskiest things we do, because they change lots of moving parts. There is extreme value in the same upgrade paths being tested by lots of people (from the same versions, to the same versions). By having defined stable releases, it is easy to check that upstream package P actually supports upgrading from version A to version B. The ability to roll back does not address the rolling-release risk: some stateful software (for example consul) does not support downgrading at all; after rollback, it will not start up. Lots of upstream software supports downgrading only a few versions. Stable OS releases help a lot here.
Basic -1 on yearly releases. I think 6-months releases are the sweet spot across the effort to upgrade, easy security patches backporting, and having new enough software for businesses.

aanderse · November 7, 2020, 8:28pm

@nh2 I strongly agree with all points on your post.

Ekleog · November 9, 2020, 8:17pm

Out of curiosity, is the stateful software problem not properly handled
by stateVersion? If not, do you have any idea how we could improve
stateVersion, or what other mechanism we could set up to reduce the
impact stable version upgrades have?

The problem I personally have in mind is, that stateVersion requires one
to bump all the stateful software of the system at the same time without
allowing for component-by-component upgrade. This problem would be
solved by splitting stateVersion into one stateVersion per component
(that would be a required option), but I’m not sure this idea would also
actually alleviate the issues you’re seeing.

nh2 · November 10, 2020, 1:04am

Regarding stateVersion, I see some issues:

It’s rarely used in nixpkgs. The postgresql service uses it, but e.g. consul doesn’t; we’d have to package multiple versions of a lot more software to even make it a possibility. It’d also require providing multiple versions of entire language toolchains. That’d be quite an effort.
stateVersion results in people running old versions. Many projects don’t actually provide point releases with security fixes for older versions.
Even if the two issues above didn’t exist, I believe stateVersion would still not make a rolling release as good as stable releases, simply because stable makes 1000s of users go through the same upgrade from/to versions. This makes it easy to build a knowledge base of what fails during an upgrade, so that:
1. People doing production stuff can wait a while before the upgrade and get a pretty good idea what is going to work and what isn’t, reducing risk;
2. We can make fixes to the upgrade experience on the stable branch! That is completely impossible on rolling, where having a fixed upgrade experience might easily imply upgrading lots of other software to new major releases (some of which may not be backwards compatible enough; upstream may not allow you to skip a release).
In a rolling system, everybody would have to make their own picks between which two commits to upgrade, potentially having to bisect the history to find suitable commits for your own stateVersion in combination with the software you currently use. That’s quite an effort, and risky (nobody may have tried upgrading across that commit pair).

Not having component-by-component upgrades is a big issue, but that’s not specific to stateVersion; it applies equally to releases. It’d be great to be able to opt into per-component upgrades for heavily stateful services; I like that idea. For our setup we actually have our own Ceph packaging and services instead of using the nixpkgs ones, simply because that allows us to upgrade Ceph and NixOS independently (doing them simultaneously would be too risky for us).

By the way, I think the current description comment on stateVersion in configuration.nix isn’t good enough. It’s truthful, but it should be way more explicit that if you don’t update this, you will run e.g. an old postgresql. It refers the user to the manual docs for the option, but I think the comment should at least directly suggest that it’s an actual good action item to eventually upgrade all services that use it and then bump it. The current wording of the comment sounds a lot like “don’t touch it”, so most people don’t even look up the details in the manual.
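
One way to make the postgresql version visible, and to upgrade it independently of stateVersion, is to set the package explicitly. The fragment below is a minimal sketch under assumptions: the stateVersion value and the `pkgs.postgresql_11` attribute are illustrative and depend on the nixpkgs revision in use.

```nix
{ pkgs, ... }:

{
  # Left at the value written by the original installation (illustrative value).
  system.stateVersion = "20.03";

  services.postgresql.enable = true;

  # Explicit choice of the major version instead of the default that the
  # module derives from stateVersion. Attribute name is an assumption that
  # depends on the nixpkgs revision.
  services.postgresql.package = pkgs.postgresql_11;
}
```

Changing this attribute to a newer major version does not migrate the data directory; the existing database still has to be dumped and restored or upgraded separately.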

7c6f434c · November 10, 2020, 10:07am

Out of curiosity, is the stateful software problem not properly handled
by stateVersion? If not, do you have any idea how we could improve
stateVersion, or what other mechanism we could set up to reduce the
impact stable version upgrades have?

It is not actually aiming at the problem in question.

For many services it describes NixOS choices like directory layouts, but does not freeze the package or its ideas of state format or whatever.

The idea is that you can keep stateVersion the same roughly forever and still do package updates etc., including for services.

Ekleog · November 10, 2020, 5:18pm

Thank you for this feedback!

FWIW, the issues with using modules from various versions of nixos is
what led me to start experimenting with alternatives to the module
system, which resulted in [1], with an example of how to use it at [2] —
unfortunately, now that the proof-of-concept is done I’ve lost
motivation to move forward with it, until I find a way to not have to
rewrite nixos modules to use them there at least — it’d require a
massive effort to do the rewrite alone otherwise. So the project is on
hold for now, but feel free to poke me if you’d be interested in talking
about it further!

[1] GitHub - NixtOS/nixtos: NixtOS, the next-generation NixOS that builds on both GuixSD concepts on steroids and nixpkgs.
[2] https://github.com/NixtOS/nixtos/blob/f91a7fdd07cc488ed7cf1d950b2e4c3b2bb80f67/tests/example.nix

Ekleog · November 10, 2020, 5:22pm

Out of curiosity, is the stateful software problem not properly handled
by stateVersion? If not, do you have any idea how we could improve
stateVersion, or what other mechanism we could set up to reduce the
impact stable version upgrades have?

It is not actually aiming at the problem in question.

For many services it describes NixOS choices like directory layouts, but does not freeze the package or its ideas of state format or whatever.

The idea is that you can keep stateVersion the same roughly forever and still do package updates etc., including for services.

Hmm… I guess opinions vary among maintainers? For instance, postgresql
is a good example of a program that you can’t upgrade without bumping
stateVersion, but it’s true that a number of services don’t actually
receive the same amount of maintenance work required to do that (and I
plead guilty to most likely having missed places where stateVersion should
have been used, though I’m not sure even if I knew them I’d put in the
sometimes significant amount of work required to change that beyond
writing release notes).
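
For illustration, the pattern of tying a default package to stateVersion can be sketched as a small module. This is a hypothetical sketch, not the actual nixpkgs code: the `services.exampledb` option, the "20.03" boundary, and the two package attributes are assumptions chosen only to show the shape.

```nix
{ config, lib, pkgs, ... }:

{
  # Hypothetical option for a made-up "exampledb" service.
  options.services.exampledb.package = lib.mkOption {
    type = lib.types.package;
    description = "Package to use for the hypothetical exampledb service.";
  };

  # Default depends on stateVersion: systems installed before the boundary
  # keep the old major version until the user bumps stateVersion or
  # overrides the option. Boundary and attributes are illustrative.
  config.services.exampledb.package = lib.mkDefault (
    if lib.versionAtLeast config.system.stateVersion "20.03"
    then pkgs.postgresql_11
    else pkgs.postgresql_9_6
  );
}
```

Because the default is set with `lib.mkDefault`, a plain assignment in the user's configuration takes precedence over it.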

7c6f434c · November 10, 2020, 7:50pm

Hmm… I guess opinions vary among maintainers? For instance, postgresql
is a good example of program that you can’t upgrade without bumping

PostgreSQL is special in terms of a single version not even being
sufficient to upgrade from a previous one, you need both binaries or
a text dump…

jonringer · November 30, 2020, 8:19pm

For cross-reference.

RFC for changing release months: [RFC 0080] Change NixOS releases to YY.05,YY.11 (NixOS/rfcs pull request #80, by jonringer)

jonringer · December 17, 2020, 5:12pm

[RFC 0080] Change NixOS releases to YY.05,YY.11 (NixOS/rfcs pull request #80) has been accepted.

New release timeline is YY.05 and YY.11

matklad · September 18, 2021, 6:30am

So, after using unstable for my desktop for about a year, this is a much much better experience: I always get the latest version of software, I don’t have to re-check when the new version of NixOS is released, and, for me at least, there’s no perceived difference in stability.

It does seem now that unstable is a misnomer, and that it’s just a rolling release.

Ericson2314 · October 3, 2021, 6:57pm

ofborg has helped a lot. But by the same token there must be something that makes releases more, or else why are they so hard to cut?

The work that goes into releases benefits the unstable channel, so it’s important to remember (not saying you weren’t) that unstable is not a good representation of what we’d have if we only had a rolling release.