Chroot 23.11 within 22.11 bare-metal because Nvidia

Following on from my previous question about current support for old Nvidia hardware, I’ve realised that:

  • Current kernels don’t support the legacy 390 series closed-source Nvidia drivers, which are the only drivers capable of handling the Optimus video hardware on a Lenovo W520.
  • Nouveau kind of works, but not satisfactorily: although all three displays are driven, they are generally not at full resolution, and any attempt to correct that (xrandr --newmode etc.) leads to all three screens going blank, as does any attempt to rearrange the displays to match their physical layout.

So I was thinking: the latest NixOS that works satisfactorily is 22.11. Could I install a minimal ‘base’ 22.11 on the metal, then install a chrooted 23.11 within that, so I can keep current with all the applications I want? Does NixOS provide any support for doing that?

It would be quite tricky, as the display would be handled by the 22.11 on-the-metal installation and the display manager and desktop by the 23.11 chrooted installation, so I couldn’t use a standard systemd display-manager unit that expects to start the X display and then run lightdm, Cinnamon etc. I’d have to configure it so that the 22.11 base starts the X display, then the 23.11 chroot runs the desktop on that.

Has anyone ever done anything like that, and is there any documentation on doing so?

Sounds a bit like a cursed setup. Maybe it would be easier to just get the kernelPackages with the older nvidia from an older nixpkgs and not have a chroot?

I’m up for whatever works. How would I do that?

Add a second flake input/channel for the other nixpkgs version, and set boot.kernelPackages to a kernel package from that nixpkgs.
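A minimal, untested sketch of the non-flake variant of this, assuming the older release is fetched as a branch tarball rather than added as a channel. The name oldPkgs is made up for illustration, and the branch URL is an assumption; pinning a specific commit with a sha256 would be more reproducible.

```nix
{ config, pkgs, ... }:
let
  # Assumption: import the older nixpkgs straight from its release branch.
  # "oldPkgs" is an illustrative name, not anything NixOS defines.
  oldPkgs = import (builtins.fetchTarball
    "https://github.com/NixOS/nixpkgs/archive/nixos-22.11.tar.gz") {
    # The Nvidia driver is unfree, so the older package set must allow it too.
    config.allowUnfree = true;
  };
in
{
  # Kernel and kernel modules (including the Nvidia module) come from the older nixpkgs.
  boot.kernelPackages = oldPkgs.linuxPackages_5_15;

  services.xserver.videoDrivers = [ "nvidia" ];

  # Resolved through boot.kernelPackages, so this is the older nixpkgs' legacy_390.
  hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.legacy_390;
}
```

Everything else on the system still comes from the current nixpkgs, which is where the mismatch between an old kernel package set and a new userland can show up.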

If you’re going to be doing this long term, I’d recommend packaging an old kernel downstream. IME you end up with subtle bugs with the above. Maybe check if someone maintains a super LTS version that gets security fixes whose source you can pull.

The only real solution is to phase out this laptop and buy something that has usable open source drivers next time, of course.

What makes you think that?

Nvidia legacy 390 marks itself as compatible with anything <6.2. The default kernel in 23.05 is 6.1.

Besides, LTS kernels going back as far as 4.19 are still available in the current release, exactly for things like this.

I have a similar issue getting the Nvidia 535.x driver for my GT 1030 to compile against the latest traditional kernels on NixOS KDE (23.05).

The only workaround I found that works for now is to use the xanmod kernel. Here’s my snippet for it:

https://github.com/tolgaerok/nixos-kde/blob/1adc6be6392b55e825767188556c5b6e74471677/core/modules/system-tweaks/kernel-upgrades/xanmod.nix#L9

I also use the production Nvidia drivers (535x)

https://github.com/tolgaerok/nixos-kde/blob/1adc6be6392b55e825767188556c5b6e74471677/core/gpu/nvidia/nvidia-stable-opengl/default.nix#L25

Change it to package = config.boot.kernelPackages.nvidiaPackages.legacy_390; and see if that works.

Feel free to snoop around the rest of my configs: https://github.com/tolgaerok/nixos-kde/tree/main

I haven’t migrated over to 2311 as 2305 is catering for all my needs atm

Alternatively, visit:

https://download.nvidia.com/XFree86/Linux-x86_64/

and select the appropriate 390 version to suit. Use the commented-out section in my snippet to download the specific driver for your card:

https://github.com/tolgaerok/nixos-kde/blob/1adc6be6392b55e825767188556c5b6e74471677/core/gpu/nvidia/nvidia-stable-opengl/default.nix#L44

What makes me think that? Well, for one, it doesn’t work. It doesn’t even start an X session with 23.11. And on Gentoo, legacy 390 blocks kernels > 5.15.
I reckon my best shot is to work out (or be told) how to hold the kernel version down.

What’s the error in the journal?

boot.kernelPackages = pkgs.linuxPackages_5_15;
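In context, that line would sit in configuration.nix roughly as below. This is only a sketch: the surrounding options are my assumption of the usual minimum needed to select the legacy 390 driver, not something taken from the config under discussion.

```nix
{ config, pkgs, ... }:
{
  # Hold the kernel at the 5.15 LTS series instead of the release default.
  boot.kernelPackages = pkgs.linuxPackages_5_15;

  # The proprietary Nvidia driver is unfree.
  nixpkgs.config.allowUnfree = true;

  services.xserver.videoDrivers = [ "nvidia" ];

  # Take the legacy 390 driver built against the kernel chosen above.
  hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.legacy_390;
}
```

Swapping in another LTS attribute such as pkgs.linuxPackages_5_10 works the same way, provided that kernel is still in the release.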

I tried 23.11 with downgraded kernel and legacy Nvidia drivers; my config. is here.
The machine boots and eventually goes to a blank laptop display with just a solid (not blinking) text cursor at the top left. From /var/log/lightdm here are x-0.log and lightdm.log. x-0.log really does end with the line (EE) and nothing else.
The keyboard doesn’t work and ctrl-alt-F1 etc. do nothing; I had to ssh in to get the above information.
Any ideas?

As mentioned, take a look into the journal.

The lightdm log indicates that X segfaulted which would also explain the empty last log line. You should be able to see that happen in the journal and kernel log.

You could try to get a backtrace using coredumpctl after it crashes to analyse further, but I do not believe this is something you could do anything about. You’re using a known troublesome hardware configuration (Optimus) in combination with an EOL driver that was hacked up to work with somewhat modern kernels. Don’t expect it to work.

One more thing you could try is to use an even older kernel. Perhaps the oldest one we have and maybe one in the middle.

Yep, as you suggest: coredump.

I tried downgrading to 5_10 – same problem.

4_14 was the only other choice but when I ran nixos-rebuild switch:

error: linux 4.14 was removed because it will reach its end of life within 23.11

But never mind, because as it happens I have a nominally identical laptop running 22.11 and it works. I just checked on that: kernel 5.15, nvidia 390.151. This 23.11 is trying to install nvidia 390.157. That minor increment should not make a difference but I’d like to try 151 on this 23.11. How do I hold the minor version number down? I see from here that legacy_390 is just a package or short-hand reference of some sort but I’m not yet sufficiently familiar with Nix’s specification DSL to know how to use the declaration there as the basis of my own with a lower minor version number.

Easiest is probably to revert those update commits touching the nvidia driver in a local Nixpkgs checkout of 23.11.

Now I am highly perplexed. I’ve made /etc/nixos/* as similar as possible to those on my working system (differences are stuff like filesystem UUIDs, version of PHP and required changes to users.users.mounty).

It still boots to a blank screen with a steady text cursor, as described above. The files under /var/log/lightdm are the same, including the final (EE). But there is no coredump in journalctl -e. That’s the only difference. No coredump, everything else the same. I am perplexed.

The only other difference is that the non-working system pulls in the Nvidia drivers version 390.157 and the old working system has 390.151. It could be that, or maybe it’s the version of GCC that’s used to build … something, during the nixos-rebuild run.

I’m going to try a 23.11 chroot within a 22.11 bare metal. It seems there is no other way.

Have you given this a shot? That will really be significantly easier IMO

Like, just do a git clone https://github.com/nixos/nixpkgs, get in there with git blame on the nvidia package file to figure out which commit did that (or use the GitHub UI if that’s too difficult), git revert that commit, and then build your config from that checkout instead of the channel (I think nixos-rebuild -I nixpkgs=/path/to/repo does the trick).

Periodically rebase against upstream to get updates. NixOS being easily manipulated via git is like half the reason to use the distro over alternatives.

No cursed 22.11 + 23.11 mix that will barely work anyway, and your host gets to have actual security updates.

The crux really is knowing that the nvidia driver version is the problem, and that an older one works. Long-term you can consider just overriding the nvidia package in your configuration so you can just use channels as normal.

Well I suppose the answer is that I know chroot, but I don’t (yet) understand playing with Nix repositories. But I’m certainly not happy with a chroot solution so I’ll take the time to understand your recommendation. Thank you.

Wait. I don’t think this is going to work. Did you see Atemu’s comment in another post?
TL;DR: Nvidia drivers 390.151 + kernel 5.15 + 22.11 do work; Nvidia drivers 390.157 + kernel 5.15 + 23.11 do not.
That is why I think I need a chroot system; because 23.11 uses a newer and likely Nvidia-breaking glibc.
So unless it is possible to hold 23.11 down to glibc 2.35 (or possibly >=2.x and <2.38) I don’t think this is going to work.
If you think it is, where should I start? Can I install 23.11 and hold the version of glibc down?

I updated to 23.05 (glibc 2.37) as a starting-point. The display seems to work alright but when IntelliJ IDEA is the only desktop app. running, the .cinnamon-wrapp process is consuming around 80% of one core, according to top, which compares unfavourably with 0.2% on the 22.11 machine.
Is it possible to move to 23.11 but hold glibc down to 2.37?
I need more help to understand how to do this with repo. manipulation. Is there a more detailed guide anywhere?

In theory, yes. You can revert this commit: NixOS/nixpkgs@e861529 (glibc: 2.37-39 -> 2.38-0).

In practice, this will mean running an older glibc than what anything has been tested against. You may run into issues with other repackaged binaries. You will also need to compile everything downstream (changing glibc is serious business), which will be painful on a decade-old laptop (for this, using another machine to build and deploying remotely, or using something like peerix may help, but this will probably remain very painful even with a faster build host).

I think you’d be best served by any generic git tutorial (no idea if that one in particular is good, I learned this stuff by osmosis), but I’ll try to take some time writing up a more detailed guide.

I think this has run long enough. Even NixOS cannot make software work that is not designed for the situation. I’ll continue to use the laptops with the built-in Intel display, but get something else for multi-screen usage. Thanks Tlater and Atemu for your help but let’s draw a line under this.
BTW, Nouveau isn’t coming to the rescue. There’s been no significant movement on it for almost three years.