What should stable NixOS prioritize?

garbas · October 30, 2020, 8:16pm

A change in the schedule is not a problem. The problem is that deadlines are important for server users, since you need to plan your activities and postpone others.

I don’t think it was said that this will solve the problem, but it would definitely help by requiring less effort. I think shifting the release schedule by 2 months is the least we can do, and this can already be done for the next release.

What we could try - maybe in a year - is to roll out a DevOps-focused NixOS release (branded separately from the current release), scheduled at a time that suits the DevOps world and maybe even helps with the stabilization of the main NixOS release.

Ericson2314 · October 30, 2020, 8:32pm

I agree it’s more effort. Though I think a big portion of the effort is “just” getting more stuff building, so there are automation gains to be had there. The manual QA also can come from dogfooding a bit, thankfully.

All my paid NixOS projects involved software being deployed and developed. Clients’ developers being happy co-developing Nix packaged software was just as important as clients’ sysadmins being comfortable deploying it.

And, at least as we pitched it, a major business case for Nix is synchronizing developer and production builds so that what’s being worked on is actually what’s being deployed and vice versa (say with reproing prod bugs), so I don’t think saying “oh well just deploy with Nix for now and Nixify the dev env later” is an efficient compromise.

I agree there are issues, but I’ve seen more developer pushback than ops pushback per the above, so I’d like to focus on that weaker link.

There are many cool things happening in the latest Nix, but I don’t get the sense that anything is around the corner. (For the record, I am very skeptical of flakes, the TOML thing, etc.) I think we could easily end up with the “last 10%” (or as the joke goes, “second 90%”) situation you described as bedeviling the server use-case.

Also, to be fair, development environment work and NixOS desktop work don’t overlap in many ways either. I don’t mean to pretend they are synonymous, but I can see that, as I switched between points rather hurriedly, it might seem that way.

I think my major points are:

New casual contributors won’t save us
Dev workflows are the weaker link to fix for bringing in more funding
Better dev workflows also unlock upstream mind-share
Upstream mind-share could help a lot to make NixOS/Nixpkgs easier to maintain, more than new downstream-only contributors would

cbot · October 31, 2020, 1:37am

If I could toss my opinion in as a new NixOS user.

As quick background, I’ve been using Arch as my daily driver for 16+ years and I work in the Linux/DevOps/SRE realm professionally, so I’m very opinionated about the software I use.

It’s only been about 2 weeks but I absolutely love my initial experiences here.

One thing I quickly noticed is that even using the unstable channel, my Plasma desktop was a few point versions behind (and the new changes had some really nice features that I missed coming from Arch).

I’m fine with running containers or patching basic applications, but KDE/Plasma5 and a DE are different situations and after so many years of a rolling release system, I don’t know if I can wait for 6 month releases of my DE.

Ideally, I’d like to see something along the lines of what garbas mentioned - a stable core (server) edition with a different release cadence for DE/WM/applications, if that’s at all possible.

Thanks for all the hard work y’all!

bhipple · October 31, 2020, 4:59am

Perhaps we should consider just doing a rolling release? NixOS seems ideally suited for this, given that we have atomic OS rollbacks from the bootloader if anything goes wrong.

jonringer · October 31, 2020, 5:04am

We do, it’s called nixos-unstable

My recommendation of postponing 2 months would just allow us to cut from master, do some stabilization, then ship it, instead of trying to make a previous DM LTS release work with the latest systemd + glibc + gcc.

kamron_m · October 31, 2020, 5:22am

There is a lot of interesting discussion in this thread but I hope @jonringer’s suggestion doesn’t get lost in the conversation.

As a user of NixOS as a daily driver, I don’t see the shift from an XX.03 and XX.09 schedule to an XX.05 and XX.11 release schedule as a drawback for me. If the release managers think this change will make the release less taxing (which it seems they do), I absolutely defer to their experience and support this change.

7c6f434c · October 31, 2020, 11:32am

On the topic of dev workflow, I would want to add that a reproducible and scriptable enough development flow can often be translated (for small-ish deployments) into an option for ops people «learn details later, just get our exact CI version easily; feel free to get a CI-similar config too by just tweaking these lines». Ops workflow might end up not creating any easy translation into a good development workflow, though.

ari-becker · November 1, 2020, 12:39pm

Following caveats:

I run nixos-unstable on a laptop, with GDM / Gnome 3, mostly pretty vanilla. This combination has worked well for me with few issues in my day-to-day usage (I didn’t realize epiphany was broken and removed from the defaults in 20.09 because I don’t use epiphany). I update and rebuild once a week, on the weekend.
I don’t run NixOS on production servers (at least, not yet… organization is too locked into legacy choices and too small to dedicate resources to changing this at this point in time)

Personally I don’t see the purpose of the stable distribution for desktop usage. I do think there’s a lot of value in long-term support (LTS), i.e. of the kind in Ubuntu, and if NixOS was Ubuntu I’d probably run LTS, but six months of support with one-month overlap to permit an upgrade is really not “long-term” as far as LTS goes. Additionally, I think that because NixOS’s declarative model dramatically reduces the risk of upgrades, the usual addition in value by virtue of LTS is dramatically reduced.

Running some kind of LTS release with overlays from unstable is probably the ideal; however, if the maintenance burden of stable is such that core maintainers are suffering from burnout, then I would posit that core maintainers should effect any change they deem necessary to reduce the burden of maintaining the stable release, as the stability improvements afforded by the stable release are relatively small and relatively less valuable (because of NixOS’s declarative configuration model), and this will be all the more true when flakes are finally no longer experimental and it’s much easier to simply update (and rollback) the nixpkgs pins of production systems when desired.

jonringer · November 1, 2020, 7:27pm

I think for most users, it’s a “safe” bet that the applications you want to use are available, and in the binary cache.

The breakages on unstable (although infrequent), can put many people off.

maxxk · November 1, 2020, 8:06pm

I think «Optimize for people’s time» option is the best. However, in previous discussions here I have seen an opinion that YY.11 may not be enough time to integrate the latest gnome and plasma, but it is worth trying. Also, if I understand correctly, major distributions will support shipped versions for at least 6 months which hopefully will make it easier to provide security updates in NixOS stable. But I also support the idea of freezing glibc, gcc, binutils etc. a month before the release month, which in case of YY.05 and YY.11 will be around the freeze time for Ubuntu and Fedora.

For another data point on usage:

As a desktop user I probably (reading this topic) represent a minority: I use NixOS stable as my primary system. I don’t use plasma or gnome, but use some applications from both, so it is important to have more up-to-date versions of application packages and the consistent set of packages overall. Qt update from 20.03 to 20.09 did break some things unfortunately.

As a production user (nixpkgs on ubuntu), we pin nixpkgs at a specific revision of stable and update the revision only to patch critical vulnerabilities or solve some immediate problems. Updating to the next stable release is usually non-trivial (we have a large Python codebase with a number of out-of-nixpkgs packages, older versions of some packages and custom forks). It takes from a couple of weeks to a month of background work after the release. We don’t use systemd from nixpkgs, so in our case it doesn’t matter. I could be wrong, but I think production users want stability and predictability of upgrades, not newer versions. If newer versions are required, in most cases they will be provided in some kind of overlay.

7c6f434c · November 1, 2020, 8:51pm

Also, maybe not only the release date should be shifted, but branch-off should be 40–60 days before the hoped release date, not 20–40? Buys a few rebuild iterations when something tries to refuse to play nicely.

Ideally, of course, big changes would land in unstable a bit further in advance (as in: with all Hydra-built tests passing, not just merged), but maybe a branch-off deadline will see less slip than «please leave some time for stabilisation before branch-off».
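
To make the two branch-off windows mentioned above (40–60 days versus 20–40 days before release) concrete, here is a minimal sketch of the date arithmetic. The target release date of 30 November 2020 and the helper name `branch_off_window` are illustrative assumptions, not an agreed schedule.

```python
from datetime import date, timedelta


def branch_off_window(release, earliest_days, latest_days):
    """Return (earliest, latest) branch-off dates, counted back from the release date."""
    earliest = release - timedelta(days=earliest_days)
    latest = release - timedelta(days=latest_days)
    return earliest, latest


# Hypothetical target date for a YY.11 release (an assumption for illustration only).
target = date(2020, 11, 30)

# Window of 20-40 days before the hoped release date.
shorter = branch_off_window(target, 40, 20)
# Suggested window of 40-60 days before the hoped release date.
longer = branch_off_window(target, 60, 40)

print("20-40 days before:", shorter[0], "to", shorter[1])
print("40-60 days before:", longer[0], "to", longer[1])
```

Under that assumed date, the longer window would put branch-off in roughly the first three weeks of October rather than late October to mid November, which is where the extra rebuild iterations would come from.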

jonringer · November 1, 2020, 9:28pm

There’s a separate discussion around staging and staging-next. But having 500 largely untested staging commits land a few days or the day prior to the target branch-off did contribute to some of the delays.

jonringer · November 1, 2020, 9:35pm

Gnome 3.38 was merged into master before 20.09 was released, so I think that’s low risk for impacting a November release date. Plasma + kde + Qt can also be stabilized before ZHF. One of the compounding issues with ZHF is that you’re risk averse to potential regressions at that point. But if we have a working plasma installation, then it’s easier to backport individual kde/qt fixes. Also, during ZHF, a lot of core contributors are “spread thinner” due to a lot of backports occurring in many different ecosystems.

Another way to word my proposal is, “The desktop experience in nixpkgs-unstable should be pretty good by the time November rolls around, so when we branch off, we can just focus on the general package regressions in master and not have to try to fix large package ecosystems while doing stabilization.”

jtojnar · November 2, 2020, 4:50am

It was not merged into master, it was merged into staging. And only approximately one day before the 20.09 release. As a GNOME maintainer, I am not really confident about including GNOME 3.38 before running it on a production system for at least a month.

Yes, they do. Each cycle there are a few issues that are discovered by Arch, Fedora or Ubuntu people and for which we do cherry-pick patches (fewer than ten this cycle, IIRC). Some of them are included in the .1 release two weeks after .0, but some of them aren’t.

But the majority of the issues, and the hardest ones to solve, come from integration with NixOS. For example, we have decided to not include the new search in the GTK file dialogue for now because waiting for that would delay the merge even further. See GNOME 3.38 · GitHub to get some idea of what a GNOME update entails.

Hopefully, we will solve these before the end of November but, even if we do yy.11 releases, the GNOME changes would have missed the branch-off point.

While not on the scale of gcc, GNOME developers do develop some more widely used packages like gtk-doc or Vala, which sometimes do break stuff. GNOME can depend on the latest version of those. And I would not be surprised if they even needed the latest version of systemd, PulseAudio or other freedesktop.org projects since their developer pools overlap, and GNOME developers often add new features to those projects, driven by the needs of the GNOME desktop.

jonringer · November 2, 2020, 5:13am

You bring up a lot of good points.

But I would just like to point out that our 5.18.5 plasma and 3.36 gnome which we released with 20.09 still have many rough edges. Even a week after releasing, I’m still largely doing “damage control”.

The point I’m trying to make is, we would still be in a much better state than we were with the early September branch-off.

Although there still hasn’t been any 5.20 plasma development on master, if we “get things rolling” with a well-structured project/issue (similar to Convert remaining python2 applications over to python3 · Issue #101964 · NixOS/nixpkgs · GitHub), and maybe a special branch + hydra job, then it is “easier” to distribute the load with the community.

jonringer · November 4, 2020, 8:44am

I’ve been looking into the release cycles of both Plasma and Gnome 40.0. Gnome follows a 6 month release schedule; however, plasma has a 4 month release cadence for their non-LTS releases. This means we will have a three month period between stable releases where our stable release will have an unsupported version of plasma. Personally, I don’t think this is much of an issue; unless there’s a compelling security reason, the plasma desktop should still be very usable. I would much rather have people saying, “why isn’t plasma at the latest”, than “why can’t I log out”.

Doesn’t plasma release an LTS version? Why don’t we use that?

They do, but even with the 20.09 release, we had issues with systemd compatibility; these regressions did not occur on the later plasma releases, as they were developed against more recent versions of systemd (logout and certain other functionality was missing or broken). Subtle incompatibility issues between these desktop managers and fundamental libraries will likely always occur; however, the greater the divergence, the more likely we will have to develop in-house solutions, delaying releases further.

As I’m not a user of gnome or plasma, I would like some feedback from those users and maintainers about their thoughts. My main interest is being able to release in a timely manner, and the Desktop experience is usually people’s “first impression” of NixOS. With the YY.05 and YY.11, I think it’s very doable to have a “mostly polished” experience going into the release branch-off. Which should save everyone a lot of headache, pain, sweat, and tears.

TL;DR
Going with YY.05 and YY.11 release dates with non-LTS versions of plasma and gnome will leave an awkward 3 month window in which plasma is unsupported. However, I don’t think this will be much of an issue as long as we can provide a very usable desktop experience.

Yarny · November 4, 2020, 4:41pm

Hi, user/admin here. As I administer some machines for very non-tech-savvy users (using plasma and xfce, not gnome), I would also opt for a stable plasma installation and would not mind if it is not the newest version. After a new NixOS release got published, I usually need some weeks to prepare/update individual configurations for my users. Beyond that, their systems run on auto-update. New features in new plasma versions are rarely missed; stability is more important.

This also holds for most other software: I don’t mind having older versions of libreoffice, latex, but also server software like sshd or httpd, although these probably need more attention with respect to security updates.

I hope this data point helps in finding a solution. Thanks for all your efforts!

austin · November 5, 2020, 2:31am

As someone coming from Arch, my ideal would be a very good, stable unstable. Basically inverting what I understand as the current focus, where unstable is less tested and the releases like 20.03, 20.09 are supposed to be more tested.

That said I’m probably in the minority there. More realistically as a Gnome user I’d prefer working slightly out of date Gnome versus up to date but broken in strange ways Gnome. Hope that helps.

7c6f434c · November 5, 2020, 4:00pm

As someone coming from Arch, my ideal would be a very good, stable unstable. Basically inverting what I understand as the current focus, where unstable is less tested and the releases like 20.03, 20.09 are supposed to be more tested.

I think in this discussion people like me who care nothing about releases just do not see any reason to participate! People who for whatever reason want to have releases with desktop support discuss how to do this better; I do not expect either to do anything for releases or to benefit from them, so I just do not see anything useful I could say…

I will continue building my system from random master snapshots and fix what breaks for me…

That said I’m probably in the minority there. More realistically as a Gnome user I’d prefer working slightly out of date Gnome versus up to date but broken in strange ways Gnome. Hope that helps.

OK, Gnome is harder… StumpWM-related breakages I can fix literally on my own in a day…

jonringer · November 5, 2020, 4:41pm

If I were dictator, I would just postpone our staging cycle (except for regression fixes) during ZHF and have ZHF target master. Branch off from a really good point in master, do a final week of backporting and QA, then release.

Then do all the “risky” staging work after branch-off. Or at least until we have a practical staging-next stabilization which is treated more like a “release” with blocking jobs.