Overriding .. options? (or more)

Hi :wave:

Despite running NixOS for ~5 years now, I still get stuck from time to time. So this question is mostly about understanding how I can debug this issue rather than finding a quick solution (there might not even be one). Thanks for reading!

  • I wanted to enable bumblebee.
  • It wanted to install a broken (old) nvidia version nvidia-x11-390.157. The issue is tracked here.
  • One of the suggestions in the issue was to extend the kernel to declare the correct version of the package, like:
  boot.kernelPackages = pkgs.linuxKernel.packages.linux_6_1.extend (final: prev: {
    nvidia_x11 = prev.nvidia_x11_legacy390;
  });
  • Since I need to use the latest kernel (thx to the new laptop), I applied like:
    kernelPackages = pkgs.unstable.linuxPackages_latest.extend (final: prev: {
      nvidia_x11 = prev.nvidia_x11_beta;
    });

which, in my mind, made sense. But it looks wrong, since it didn’t change the error: it still wanted to install the same broken version.

  • Then I tried copying the bumblebee module and importing it locally in my config, even changing the option name and the driver version. But the system did not care about that either (even though I enabled only the fixedbumblebee option this time) and complained about the same broken version. It didn’t fail because of the option name etc., so this must mean the import itself worked.

I am missing something fundamental here. Could you point out why I can’t make it work with these 2 methods?

Thanks again!


What you’re trying to do now is replace your whole system’s driver (maybe not with 390, but the user who suggested this already had 390 as the system driver), so this might not work in your case.

The main problem, according to the GitHub issue, is i686 compatibility. A simple workaround is to replace bumblebee’s nvidia_x11_i686 instead:

nixpkgs.overlays = [
    (
      self: super:
      let
        nvidia_i686 = pkgs.linuxPackages.nvidia_x11.lib32;
      in
      {
        bumblebee = super.bumblebee.override { nvidia_x11_i686 = nvidia_i686; };
        primus = super.primus.override {
          primusLib_i686 = pkgs.pkgsi686Linux.primusLib.override { nvidia_x11 = nvidia_i686; };
        };
      }
    )
];

The OP in the issue did not test this, and just disabled i686 support altogether, which is another solution and can be done with:

nixpkgs.overlays = [
   (self: super: {
     bumblebee = super.bumblebee.override {
       nvidia_x11_i686 = null;
       libglvnd_i686 = null;
     };
     primus = super.primus.override {
       primusLib_i686 = null;
     };
   })
];

Thanks for the explanation. I missed the compatibility since it was failing even on my previous steps.

Applying your suggestion caused error: Package ‘nvidia-x11-560.31.02’ in /nix/store/h60m1fwahjd2mv6gsg77ji3vb4gpj4dk-source/pkgs/os-specific/linux/nvidia-x11/generic.nix:238 is not available on the requested hostPlatform: but it’s pretty understandable.

The weird thing is, even though I added { nixpkgs.config.allowUnsupportedSystem = true; } it doesn’t accept it. I still can’t grasp why my explicit selections in the config do not fully apply when I want to override something around the nvidia stuff.

Setting the i686 stuff to null, as in the 2nd suggestion, seems to work though; at least nixos-rebuild does not complain.

Forgot to enable bumblebee, so I thought it was working :upside_down_face:

This did compile, though. Can you test if it works?

    (
      self: super:
      let
        nvidia_i686 = pkgs.linuxPackages.nvidia_x11.lib32;
      in
      {
        bumblebee = super.bumblebee.override { nvidia_x11_i686 = nvidia_i686; };
        primus = super.primus.override {
          primusLib_i686 = pkgs.pkgsi686Linux.primusLib.override { nvidia_x11 = nvidia_i686; };
        };
      }
    )

Huh, it compiles, but gets a weird issue when I want to use primus:

Aug 14 17:16:15 splinter kernel: nvidia: module license 'NVIDIA' taints kernel.
Aug 14 17:16:15 splinter kernel: Disabling lock debugging due to kernel taint
Aug 14 17:16:15 splinter kernel: nvidia: module license taints kernel.
Aug 14 17:16:16 splinter kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 511
Aug 14 17:16:16 splinter kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  555.58.02  Tue Jun 25 01:39:15 UTC 2024
Aug 14 17:16:16 splinter acpid[1402]: client connected from 5342[0:1]
Aug 14 17:16:16 splinter acpid[1402]: 1 client rule loaded
Aug 14 17:16:16 splinter bumblebeed[1417]: [XORG] (WW) Warning, couldn't open module mouse
Aug 14 17:16:16 splinter bumblebeed[1417]: [XORG] (EE) Failed to load module "mouse" (module does not exist, 0)
Aug 14 17:16:16 splinter bumblebeed[1417]: [XORG] (EE) No devices detected.
Aug 14 17:16:16 splinter bumblebeed[1417]: [XORG] (EE)
Aug 14 17:16:16 splinter bumblebeed[1417]: [XORG] (EE) no screens found(EE)
Aug 14 17:16:16 splinter bumblebeed[1417]: [XORG] (EE)
Aug 14 17:16:16 splinter bumblebeed[1417]: [XORG] (EE) Please also check the log file at "/var/log/X.bumblebee.log" for additional information.
Aug 14 17:16:16 splinter bumblebeed[1417]: [XORG] (EE)
Aug 14 17:16:16 splinter bumblebeed[1417]: [XORG] (EE) Server terminated with error (1). Closing log file.
Aug 14 17:16:16 splinter bumblebeed[1417]: X did not start properly

mouse not found… alright… :sweat_smile: I think it’s trying to create a whole new X server with missing info?

I will try the production driver I guess…

P.S. The other config, which I mentioned as “working”, also failed, in a different way:

Aug 14 17:04:50 splinter kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 511
Aug 14 17:04:50 splinter kernel: NVRM: The NVIDIA GPU 0000:01:00.0 (PCI ID: 10de:28a1)
                                 NVRM: installed in this system is not supported by the
                                 NVRM: NVIDIA 555.58.02 driver release.
                                 NVRM: Please see 'Appendix A - Supported NVIDIA GPU Products'
                                 NVRM: in this release's README, available on the operating system
                                 NVRM: specific graphics driver download page at www.nvidia.com.
Aug 14 17:04:50 splinter kernel: nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
Aug 14 17:04:50 splinter kernel: NVRM: The NVIDIA probe routine failed for 1 device(s).
Aug 14 17:04:50 splinter kernel: NVRM: None of the NVIDIA devices were initialized.
Aug 14 17:04:50 splinter kernel: nvidia-nvlink: Unregistered Nvlink Core, major device number 511
Aug 14 17:04:50 splinter bumblebeed[1601]: Could not load GPU driver

even though it’s actually supported according to the release notes…

I hadn’t used nvidia for a long time, but apparently not much has changed.

Maybe try using the package from the nvidia module, instead?

    (
      self: super:
      let
-       nvidia_i686 = pkgs.linuxPackages.nvidia_x11.lib32;
+       nvidia_i686 = config.hardware.nvidia.package.lib32;
      in
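
For context, here is a minimal sketch of how the changed line could sit in a complete module. It assumes the file is a NixOS module that takes `config` as an argument (otherwise `config.hardware.nvidia.package` is not in scope) and that the rest of the overlay stays as in the earlier suggestion; it is untested on this hardware.

```nix
{ config, pkgs, ... }:
{
  nixpkgs.overlays = [
    (
      self: super:
      let
        # Take the 32-bit libraries from whatever driver the nvidia module selects,
        # instead of the default pkgs.linuxPackages.nvidia_x11.
        nvidia_i686 = config.hardware.nvidia.package.lib32;
      in
      {
        bumblebee = super.bumblebee.override { nvidia_x11_i686 = nvidia_i686; };
        primus = super.primus.override {
          primusLib_i686 = pkgs.pkgsi686Linux.primusLib.override { nvidia_x11 = nvidia_i686; };
        };
      }
    )
  ];
}
```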

Thanks, but didn’t change the result. Still no devices detected.

Why don’t you just set up hybrid graphics with PRIME? It’s much simpler than bumblebee and it works with recent drivers.
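
A minimal sketch of what a PRIME offload setup could look like, in case it helps. The bus IDs are assumptions: `PCI:1:0:0` is derived from the `0000:01:00.0` address in the log earlier in the thread, and `PCI:0:2:0` is only the usual location of an Intel iGPU, so both should be checked with `lspci` first.

```nix
{ ... }:
{
  # Load the proprietary nvidia driver.
  services.xserver.videoDrivers = [ "nvidia" ];

  hardware.nvidia.prime = {
    # Render on the Intel GPU by default, offload to nvidia on request.
    offload.enable = true;

    # Assumed bus IDs, verify with lspci before using.
    intelBusId = "PCI:0:2:0";
    nvidiaBusId = "PCI:1:0:0";
  };
}
```

Depending on the NixOS release and driver version, further `hardware.nvidia.*` options may be required; the sketch only covers the PRIME part.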

That’s exactly what is happening.

The mouse module missing is a known issue.
There was an attempt to fix this here: bumblebee: add xf86inputmouse to x11 modules by abmantis · Pull Request #282646 · NixOS/nixpkgs · GitHub, but that is not the reason it fails to start.
And not displaying two mice is probably a feature.

But, regarding the “no screens found” error: I have the following in my configuration:

      bumblebee = pkgs.bumblebee.override {
        # Fixes:  [XORG] (EE) no screens found(EE)
        extraNvidiaDeviceOptions = ''
          BusID "PCI:01:00:0"
        '';
      };

Maybe this helps?
According to the log of your second attempt, your GPU seems to be sitting at the same bus address as mine:

The NVIDIA GPU 0000:01:00.0 (PCI ID: 10de:28a1)
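
If it is unclear where that override goes, one option is an overlay, sketched below. This assumes the NixOS bumblebee module builds on `pkgs.bumblebee`, so that an overlay reaches it; the `BusID` value is the one from my configuration and has to match your own GPU.

```nix
{ ... }:
{
  hardware.bumblebee.enable = true;

  nixpkgs.overlays = [
    (self: super: {
      bumblebee = super.bumblebee.override {
        # Pin the nvidia device in the generated xorg config.
        # Fixes:  [XORG] (EE) no screens found(EE)
        extraNvidiaDeviceOptions = ''
          BusID "PCI:01:00:0"
        '';
      };
    })
  ];
}
```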


Eh, I am coming from PRIME. My experience was a lot different on that front: kernel panics, mixed-up graphics (like 1/3rd of a sentence not even being rendered), lagging screens (e.g. terminals not refreshing until I press a button or move the mouse over them), etc.
At least I can trust what is on the screen with intel graphics.


Thanks for this. I’ve added it, but it did not directly fix the issue. The warning also explained that it detected the PCI bus and device correctly, but for some reason the device is not supported.
At this point I’m just randomly poking all variables :smiley:

Alright, I caved. Switched to PRIME again. Maybe the new driver will spare me the hangs/kernel panics.

Thanks again everyone, I now know a bit more about nix… and pain caused by nvidia :sweat_smile:

There is a reason why Nvidia is at the top of the Nix fuck-up assessment form :melting_face:


What error were you getting?

It was something like “the package is broken”, and the version it tried to compile was still the old 390 version despite my extend attempt.

It’s caused by the nvidia package itself: if you’re using a Linux kernel version > 6.1, it will be marked as broken. Maybe you can try to copy the package locally and remove this line:

Yeah, but this is the version I do not want to use. That driver doesn’t support my card since it’s too old. I was trying to use the latest version.

Anyway, thanks for the effort. I am relatively happy with PRIME after recent minor upgrades of the kernel and the driver. Not sure if I’d like to dive into bumblebee anymore (since it looks a bit stale).