It’s a common problem for folks like me who want to run the latest kernel: the proprietary Nvidia drivers available in the repository fail to build against their kernel version. On weak hardware, you can spend quite a lot of time building different versions of the driver, only to find that they all refuse to work. Yes, you can scout the issue tracker before upgrading, wait until somebody figures out which version builds on the newest kernel, and only then proceed, but I’d prefer that version control be left to the machine. So would it be possible to include a list of compatible kernel versions (packages?) in the driver declaration? This way, if I try to upgrade too early, Nix can warn me about it instead of wasting resources building something that’s going to fail anyway. It could also suggest that I switch to latest if it’s proven to work with my current kernel version when production fails to build. And throw in a switch to override the check if you know what you’re doing.
I must mention that I haven’t explored NixOS beyond extensive but overall simple configuration. Maybe the latest driver is supposed to always work with the latest kernel, but hasn’t been updated in time.
Nvidia develop their drivers against the LTS kernel, so as long as you stick to that, you will never run into this incompatibility.
Most people should just use the single available kernel version that nvidia actually support.
There is only one valid use case for non-LTS kernels with nvidia, which is devices that need drivers for other components that only exist in newer kernels.
The majority of configs using latest that I see are on desktop systems with the likes of 1060s and 3060s, neither of which will be in devices that need that treatment at this point, nor are the users likely to even be aware of what the newer kernels add. Stop getting distracted by the word “latest” and then complaining about the literal consequences of your choices; please just use the LTS kernel (i.e., the default you get when not specifying _latest or a specific version).
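As a minimal sketch of what “just use the LTS kernel” means in a configuration: the snippet below assumes a plain NixOS module file, and it assumes that pkgs.linuxPackages (the default value of boot.kernelPackages) is the LTS kernel shipped by your release.

```nix
{ pkgs, ... }:
{
  # Default: leaving this option unset has the same effect.
  boot.kernelPackages = pkgs.linuxPackages;

  # Opt-in to the newest kernel in nixpkgs. This is the setting that
  # can leave the out-of-tree nvidia module without a compatible build.
  # boot.kernelPackages = pkgs.linuxPackages_latest;
}
```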
For the actual valid use case, people have previously petitioned for the most recent two kernels to stay in nixpkgs so that people in this spot can bide their time until nvidia push an update, but by nixpkgs security policy anything considered EOL upstream will be purged, and no exception was made here. In a way, your hardware is unsupported by nvidia, and NixOS likes to play close to what upstream supports, so I think this is a reasonable policy.
In that thread it was suggested that a latestCompatibleLinuxPackages thing could be added, like what zfs had at the time. It was similar to what you’re suggesting, just with a bit more understanding of how the support direction works. The zfs version of this has since been deprecated because it sometimes meant downgrading kernels, which could result in unbootable systems, and people didn’t really understand it.
Unfortunately even as of 24.11 the default kernel version is 6.6, which doesn’t feature bcachefs that I use. The current official LTS is 6.12, which features the latest bcachefs version currently available in the kernel (at the very least, it’s very close). I’d like to keep it updated because this filesystem is evolving rapidly and I don’t want to encounter issues making giant leaps across its major updates when it can be upgraded and tested through daily use incrementally. It’s not like I really have a choice, considering that intermittent kernel versions get purged.
Not a big fan of this specific wording. I have to note that I do keep my kernel version fixed, and it’s not set to latest. It’s just that 6.13 is currently also the latest in the repo, and updating to it caused me some pain today, so I used that word.
You could then argue that my setup is pretty experimental and therefore doesn’t warrant special treatment by the mainline. And yeah, I can totally live with just checking the issue tracker to see if I can upgrade yet. But I find it silly that in such an environment as Nix, some simple version checks are still left for the user to perform - especially when the right kernel version is crucial for the driver.
Could you please elaborate on what was going on with ZoL? I don’t think I understand enough to connect it to my question.
That’s fair; if you use stable you’ll have to explicitly pin to 6.12.
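A minimal sketch of such an explicit pin, assuming your channel exposes pkgs.linuxPackages_6_12 and that you want the production driver branch built against that same kernel set:

```nix
{ config, pkgs, ... }:
{
  # Pin the kernel to 6.12 instead of following the release default
  # or linuxPackages_latest.
  boot.kernelPackages = pkgs.linuxPackages_6_12;

  # Take the driver from the pinned kernel's package set so that the
  # module is built against exactly this kernel.
  hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.production;
}
```

With a pin like this, the kernel only changes when you edit the attribute yourself, so a new kernel release cannot silently pull the rug out from under the driver.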
Intermittent non-LTS versions get purged. And again, that’s more to do with upstream decisions here.
We tried (a variation of) what you suggested for nvidia with zfs, which is also an out-of-tree kernel module, but it caused more confusion and breakage than the help it provided.
Though, I don’t see the issue with simply explicitly listing out the list of compatible kernel versions (other than maintenance effort, of course).
Still, I don’t entirely see the concern with sticking to the latest LTS here.
Yep, fair enough, that means that complaint is not aimed at you - you clearly have a valid use case. I’m sure plenty of folks who do misuse the kernel setting will want something like this too, though, and that’s to preempt and explain what the way to do it correctly is.
Well, the zfs issue shows exactly this; it’s not about the version check, but about the implicit contract nixpkgs provides with such a tracker. Since nix only tracks the latest kernel + the latest LTS (because anything else would mean potentially known vulnerable software being officially signed off on in nixpkgs), any time nvidia cannot compile against the latest kernel the property has to be switched to the latest LTS.
As a result, next time you nixos-rebuild and reboot your computer, bcachefs is missing and it fails to boot. A clear case of backwards-incompatible breakage when you were following upstream’s rules perfectly; this should never happen. You also cannot update anyway because you have to wait until a bcachefs kernel + nvidia combo becomes available again. In the worst case there’s some critical CVE (think commercial users) and you now have to figure out how to deploy a custom kernel while potentially under active attack and when your sysadmin is asleep.
The reality is, for that period of time, there is no version that satisfies your requirements without breaking nixpkgs security policy, and nixpkgs cannot guarantee that this will never happen. Really, nixpkgs should never have pretended to support your use case, and left the responsibility of hacking around these limitations to you in the first place.
This is precisely why the zfs passthru kernel was deprecated.
That said, if you want to have a NixOS that satisfies your requirements, you can pretty trivially package your own kernel, and then use something like nvfetcher to keep track of new releases. That way you completely escape any concerns about backwards compatibility, and nixpkgs doesn’t have to be responsible for it either, but you still get most of the convenience.
If anything, I think we should consider making explicit in nixpkgs that using nvidia with anything but the LTS kernel is unsupported. Require a hardware.nvidia.forceIncompatibleKernel = true; if the kernel version doesn’t match current LTS or such; currently backwards-incompatible breakage is easy to achieve just by switching kernels. Oh, hey, look, the zfs folks have arrived at the same conclusion. One day we’ll figure out how to generalize across these two out-of-tree modules so we don’t have to find solutions like this organically twice.
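To make the proposal concrete, here is a rough sketch of how such a guard could look as a local NixOS module. Everything about it is an assumption: hardware.nvidia.forceIncompatibleKernel is the hypothetical option named above and does not exist in nixpkgs, pkgs.linuxPackages is used as a stand-in for “current LTS”, and the driver is considered enabled when "nvidia" appears in services.xserver.videoDrivers.

```nix
{ config, lib, pkgs, ... }:
let
  cfg = config.hardware.nvidia;

  # Assumption: the default kernel set is the LTS kernel of this release.
  ltsSeries = lib.versions.majorMinor pkgs.linuxPackages.kernel.version;
  usedSeries = lib.versions.majorMinor config.boot.kernelPackages.kernel.version;

  nvidiaEnabled = lib.elem "nvidia" config.services.xserver.videoDrivers;
in
{
  # Hypothetical override switch, declared here only for illustration.
  options.hardware.nvidia.forceIncompatibleKernel = lib.mkOption {
    type = lib.types.bool;
    default = false;
    description = "Allow the nvidia driver on a kernel other than the LTS one.";
  };

  # Fails at evaluation time, before anything is built.
  config.assertions = [
    {
      assertion = !nvidiaEnabled || cfg.forceIncompatibleKernel || usedSeries == ltsSeries;
      message = ''
        nvidia is enabled on kernel ${usedSeries}, but the LTS series is ${ltsSeries}.
        Set hardware.nvidia.forceIncompatibleKernel = true; to proceed anyway.
      '';
    }
  ];
}
```

Because this is an assertion rather than an automatic fallback, it never changes the kernel for you; it only refuses to evaluate until you either return to the LTS series or flip the switch.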
I just don’t see why this would have to be tied to nixpkgs at all. If anyone wants to maintain such a list, just write a lil’ readme in a github repo and list kernel vs driver version in a neat matrix. Then post that in Announcements. I’m sure other ecosystems would also appreciate the existence of such a tracker.
nixpkgs still couldn’t use it for anything, but you could filter nvfetcher versions with it, and then you would have the automation you want.
Could even automate it further by building nvidia drivers against linux kernels with nix and making github actions basically post the results. But yeah, someone has to do that.
I think wanting to use bcachefs is valid; though indeed, if you don’t think the version in the current LTS available on NixOS is stable enough it’s probably not advisable to use it on a system with an nvidia GPU, at least for the moment.
I’m honestly curious why 6.12 isn’t in stable yet, I forget what the kernel release schedule for NixOS is every time I run into this.
Let me explain the situation that we had for ZFS.
I had some issues with that in my NixOS career until I opted to use LTS explicitly.
There was an attribute on the zfs module called latestCompatibleLinuxPackages that pointed to the latest kernel that is compatible with ZFS upstream (so the upper bound of their compatibility matrix).
Most of the time, the newest kernel zfs supports is the previous version of the stable kernel (matching kernel.org naming here).
Also important to know: a Linux kernel release is only supported for roughly one month after the next kernel gets released, unless it’s an LTS kernel.
An upstream out-of-tree module basically has that time frame to become compatible and cut a release in order to stay out of EoL territory.
In addition, EoL packages (which includes kernels) are removed from nixpkgs quite quickly.
For ZFS that resulted in the following situation: ZFS supported up to 6.9, 6.10 got released, 6.9 became EoL and was removed, but ZFS did not yet support 6.10, so its latestCompatibleLinuxPackages fell back to 6.6 (as it was the most recent kernel in nixpkgs that was supported).
If someone used boot.kernelPackages = config.boot.zfs.package.latestCompatibleLinuxPackages; in their config, that resulted in a downgrade from 6.9 → 6.6, which breaks systems that rely on features not available in 6.6.
After a longer discussion in all directions, latestCompatibleLinuxPackages was removed from the module, and the module is now marked as broken for kernels that are too new.
About the issue itself:
The nvidia module seems to do something similar, but only for the lower bound. Maybe it could also be set for the upper bound; then, if I remember correctly, it would fail at evaluation rather than at build time.
Wait, nvidia does this? I can’t see a passthru.latestCompatibleLinuxPackages in the package or anything similar in the module, where did you spot it?
I meant marking kernels as broken. It’s using kernelAtLeast for the broken attribute.
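For illustration, a two-sided version of that check could look roughly like the sketch below. It is not the driver’s actual expression: the version bounds are made-up placeholders, and it uses lib.versionOlder and lib.versionAtLeast from the nixpkgs lib directly.

```nix
{ lib, kernel }:
let
  # Hypothetical bounds, for illustration only.
  minKernel = "4.15"; # oldest kernel the driver still builds against
  firstUnsupportedKernel = "6.13"; # first kernel it is known not to build against
in
{
  # Mark the package as broken outside the supported range, so that the
  # failure shows up at evaluation time instead of partway through a build.
  meta.broken =
    lib.versionOlder kernel.version minKernel
    || lib.versionAtLeast kernel.version firstUnsupportedKernel;
}
```

The upper bound is the part that needs ongoing maintenance, since it has to be bumped every time the driver gains support for a newer kernel.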
*gulp*
It does feel like an implementation fault though. In my mind, a similar peg wouldn’t prevent you from going backwards, but it also wouldn’t downgrade you by itself unless you tell it to. It’s to leave you with warnings and choices rather than broken configurations. But I guess simply staying on LTS is good enough. Poor ZoL folks though.
Well that’d be quite a sad world if we all had to write our own scripts for everything! But indeed my case is a bit too complex for native support, versioning both the kernel and the drivers and the filesystem at the same time.
This looks like a nice solution. To live on the edge, flip switches one and two. I think this will work for the majority of people, and provide more opportunity for the naughty to reconsider.
Unfortunately I’ll have to keep juggling like this until bcachefs becomes stable enough to be usable at the next LTS release. Now that I’m on 6.13, I hope it stays around long enough until 6.14’s bcachefs and Nvidia mature. That is also supposed to be bcachefs’ last major disk format update, so the next updates should be less impactful to migrate to, and I might start settling on LTS.
Thanks for the insights.
That’d require keeping state, so implementing that would be quite ugly. You’d have to abort at activation time, at which point things get problematic because the activation script is no longer idempotent and all of NixOS’ promises fall apart.
But yeah, I feel your pain. I was somehow stuck with using non-LTS kernels for two years before my motherboard’s bluetooth driver hit LTS.
It’d be nice if nixpkgs kept more than the latest kernel, so that the “peg” could be implemented without such a drastic difference in kernel version, but well, I’ve yammered on about broken contracts for long enough.