Nvidia open breaks hardware acceleration

Hi everyone, I’m on 25.05, and when I switch hardware.nvidia.open to true in my config, hardware acceleration stops working. If I run vainfo I get this error:

Trying display: wayland
libva info: VA-API version 1.22.0
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /run/opengl-driver/lib/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_1_0
libva error: /run/opengl-driver/lib/dri/nvidia_drv_video.so init failed
libva info: va_openDriver() returns 1
vaInitialize failed with error code 1 (operation failed),exit

These are the nvidia related parts of my config:

# NVIDIA

  # Enable OpenGL
  hardware.graphics = {
    enable = true;
  };

  # Load nvidia driver for Xorg and Wayland
  services.xserver.videoDrivers = ["nvidia"];
  
  environment.variables = {
    GBM_BACKEND = "nvidia-drm";         
    __GLX_VENDOR_LIBRARY_NAME = "nvidia";  
    MOZ_DISABLE_RDD_SANDBOX = "1";         
    LIBVA_DRIVER_NAME = "nvidia";
    NIXOS_OZONE_WL= "1";
    WLR_NO_HARDWARE_CURSORS = "1";
    MOZ_ENABLE_WAYLAND = "1";
    NVD_BACKEND = "direct";
    XDG_SESSION_TYPE = "wayland";
  };

  hardware.nvidia = {

    # Modesetting is required.
    modesetting.enable = true;

    # Nvidia power management. Experimental, and can cause sleep/suspend to fail.
    # Enable this if you have graphical corruption issues or application crashes after waking
    # up from sleep. This fixes it by saving the entire VRAM memory to /tmp/ instead 
    # of just the bare essentials.
    powerManagement.enable = true;

    # Fine-grained power management. Turns off GPU when not in use.
    # Experimental and only works on modern Nvidia GPUs (Turing or newer).
    powerManagement.finegrained = false;
    
    #dynamicBoost.enable = true; # Dynamic Boost
    
    nvidiaPersistenced = false;

    # Use the NVidia open source kernel module (not to be confused with the
    # independent third-party "nouveau" open source driver).
    # Support is limited to the Turing and later architectures. Full list of 
    # supported GPUs is at: 
    # https://github.com/NVIDIA/open-gpu-kernel-modules#compatible-gpus 
    # Only available from driver 515.43.04+
    # Currently alpha-quality/buggy, so false is currently the recommended setting.
    open = true;

    # Enable the Nvidia settings menu,
    # accessible via `nvidia-settings`.
    nvidiaSettings = true;

    # Optionally, you may need to select the appropriate driver version for your specific GPU.
    package = config.boot.kernelPackages.nvidiaPackages.latest;
  };

and these are the packages I installed:

# Nvidia
  vaapiVdpau
  libvdpau
  libvdpau-va-gl
  nvidia-vaapi-driver
  vdpauinfo
  libva
  libva-utils

This is what nvidia-smi returns, in case it helps:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.77                 Driver Version: 565.77         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2060 ...    Off |   00000000:29:00.0  On |                  N/A |
| 29%   27C    P8             14W /  175W |    1258MiB /   8192MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1686      G   /run/current-system/sw/bin/Hyprland           917MiB |
|    0   N/A  N/A      1741      G   Xwayland                                        2MiB |
|    0   N/A  N/A      2020      G   kitty                                          48MiB |
|    0   N/A  N/A      2236      G   ...-firefox-134.0/bin/.firefox-wrapped        221MiB |
+-----------------------------------------------------------------------------------------+

If I leave the open option set to false everything works.
What can I do?

What do you mean by “installed”?

Also is there a reason you want the open drivers specifically?

They should be preferred these days, so not using them would be the strange decision.

That said, all those packages, even if properly configured, probably interfere with each other. Most of them are for Intel anyway. If you installed them with nix-env, please remove them (and don’t use nix-env on NixOS anyway). To me it doesn’t look like this could ever have worked to begin with.

Your config should contain exactly this:

hardware.opengl.extraPackages = [
  pkgs.nvidia-vaapi-driver
];

environment.variables = {
  NVD_BACKEND = "direct";
  LIBVA_DRIVER_NAME = "nvidia";
};
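
For readers on a newer release: the original post already uses hardware.graphics.enable, and as far as I know hardware.opengl was renamed to hardware.graphics in NixOS 24.11, with the old name kept as an alias. A minimal sketch of the same settings under the newer option name, assuming that rename applies to your channel:

```nix
{ pkgs, ... }:
{
  hardware.graphics = {
    enable = true;
    # VA-API drivers belong here so they end up under /run/opengl-driver,
    # which is the path libva tries in the vainfo output above.
    extraPackages = [ pkgs.nvidia-vaapi-driver ];
  };

  environment.variables = {
    NVD_BACKEND = "direct";
    LIBVA_DRIVER_NAME = "nvidia";
  };
}
```

This is meant as an equivalent spelling, not a different fix; if your nixpkgs predates the rename, keep hardware.opengl.extraPackages as written above.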

Of course, for wayland support in general you still need these variables: dotfiles/nixos-modules/nvidia/default.nix at 561931560d2c12e81f139ef8c681e6d99fc6c54e · TLATER/dotfiles · GitHub

Note that VA-API on Nvidia is currently very limited. With some additional settings you can get it to work on Firefox, but e.g. Chrome cannot use it.

This is what I add for Firefox support:

environment.variables.MOZ_DISABLE_RDD_SANDBOX = "1";
programs.firefox.preferences = {
  "media.ffmpeg.vaapi.enabled" = true;
  "media.rdd-ffmpeg.enabled" = true;
  "media.av1.enabled" = false; # Won't work on the 2060
  "gfx.x11-egl.force-enabled" = true;
  "widget.dmabuf.force-enabled" = true;
};

Disabling the RDD sandbox is required, but note that this means that the rendering thread escapes the sandbox. If some crafty website manages to find a way to escape that with just WebGL & co., this significantly reduces your browser’s security.

This also requires programs.firefox.enable to be set, so make sure you install Firefox through that option and not through home-manager or nix-env.
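
To make the dependency on the enable option concrete, here is a small sketch with the preferences nested under an enabled programs.firefox. Only one preference is shown for brevity; the rest of the list above would go in the same attribute set.

```nix
{
  programs.firefox = {
    # Without this, the preferences below are never applied, because
    # Firefox is then not installed through the NixOS module.
    enable = true;
    preferences = {
      "media.ffmpeg.vaapi.enabled" = true;
    };
  };
}
```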

Good to know, in that case, the comment in their config (“false is currently the recommended setting”) seems misleading :thinking:

Yep. Don’t blame you @JamesSunderland, nvidia configuration has been changing every 3 months for the last 3-4 years or so and there’s a lot of outdated or just bad advice floating around. That comment may well be bitrot, too.

The state of the nvidia drivers is a mess. Only mildly related, but I’ve been test-driving a start to a general-purpose “make nvidia work” module. I’d like to have an option where you can just set whether you have an iGPU and the GPU name, and the module figures out the rest according to current crowd-sourced nvidia config knowledge. Maybe integrate it into the nixos-generate-replacement-to-be.

I should really get somewhere with that, but my pet projects never go anywhere, so don’t expect anything anytime soon. Time and whatnot.

That doc was written long before NVIDIA announced they would go open source from version 560 onward; the wiki page has not been updated since v550 was the production release in NVIDIA’s stable repo. It could use a tune-up at this point.

But for what it’s worth, I never got any driver past v535 + rcu to work correctly: 545 was totally smashed, 550 was mostly OK but with artifacts, and 560, 561, 563 and 565 fail the kernel module compile at rebuild.

I’ve also seen and heard a lot of complaints about the stability of the newly released versions, aside from my own experiences.
NVIDIA is just a shit company for driver support on Linux.
EDIT: I still can’t use 3D accel in Discord, which is also completely borked. It will only show a black screen unless I launch it with x11 as an env variable. GPU accel is completely borked for me on any Chromium-based framework. It’s insanely frustrating at times.

The compilation failing still makes it sound like you have an issue with the kernel you’re compiling against, i.e. it sounds like you’re compiling an older driver version than you think you are (which is what would make the rcu patch necessary). After all, the kernel and nvidia source code we’re compiling cannot differ unless you broke sha256, your memory is faulty, or your configuration is not what you think it is. Make sure you update your hashes correctly; if you specify the wrong hash, nix will just reuse the old driver sources.
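
A sketch of what "updating the hashes correctly" can look like, assuming the driver is pinned with nvidiaPackages.mkDriver (the thread does not show how it was actually pinned). The version is the one from the nvidia-smi output earlier in the thread; the hashes are deliberately placeholders, not real values.

```nix
{ config, lib, ... }:
{
  hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.mkDriver {
    version = "565.77";
    # lib.fakeHash is a placeholder that cannot match any real download.
    # The build fails with a hash mismatch and prints the real hash, which
    # you then paste in. Leaving a stale but valid hash from an older
    # version here is what makes nix silently reuse the old sources.
    sha256_64bit = lib.fakeHash;
    sha256_aarch64 = lib.fakeHash;
    openSha256 = lib.fakeHash;
    settingsSha256 = lib.fakeHash;
    persistencedSha256 = lib.fakeHash;
  };
}
```

Replacing the hashes one failed build at a time is tedious, but it guarantees that every source really corresponds to the version you wrote down.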

If that’s still not it, maybe try installing from scratch to a clean partition with a really trivial config to reproduce it?

I don’t think it’s too surprising that known bugs in a (two?) year-old version of the driver aren’t magically disappearing. Nvidia have been putting a lot of time into it since they hit the AI marketing goldmine, and at least IME the modern driver finally works for normal desktop use, even if configuration is still messy on account of it being an out-of-tree module. They’ve also at least announced an attempt to merge it, at which point only the mesa incompatibility would be a problem. I wouldn’t exactly call this an act of newfound selflessness, especially since GPUs from before ~2018 will remain broken forever, but at least the money is finally flowing such that we benefit from it a little.

Either way, this is a bit beside the point of this topic; we should move this part of the discussion elsewhere.

Hi, first of all thank you for your response. Those packages were installed as system packages through my configuration.nix file.
I removed the unnecessary ones and kept only libva, libva-utils and nvidia-vaapi-driver, and these are the env variables that I left:

environment.variables = {
    GBM_BACKEND = "nvidia-drm";         
    #__GLX_VENDOR_LIBRARY_NAME = "nvidia";  
    MOZ_DISABLE_RDD_SANDBOX = "1";         
    LIBVA_DRIVER_NAME = "nvidia";
    #NIXOS_OZONE_WL= "1";
    WLR_NO_HARDWARE_CURSORS = "1";
    #MOZ_ENABLE_WAYLAND = "1";
    NVD_BACKEND = "direct";
    #XDG_SESSION_TYPE = "wayland";
  };

But I still run into the same issue after rebooting, meanwhile if I switch the option open to false everything works perfectly:

~ vainfo
Trying display: wayland
libva info: VA-API version 1.22.0
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /run/opengl-driver/lib/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_1_0
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.22 (libva 2.22.0)
vainfo: Driver version: VA-API NVDEC driver [direct backend]
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            :	VAEntrypointVLD
      VAProfileMPEG2Main              :	VAEntrypointVLD
      VAProfileVC1Simple              :	VAEntrypointVLD
      VAProfileVC1Main                :	VAEntrypointVLD
      VAProfileVC1Advanced            :	VAEntrypointVLD
      VAProfileH264Main               :	VAEntrypointVLD
      VAProfileH264High               :	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointVLD
      VAProfileHEVCMain               :	VAEntrypointVLD
      VAProfileVP8Version0_3          :	VAEntrypointVLD
      VAProfileVP9Profile0            :	VAEntrypointVLD
      VAProfileHEVCMain10             :	VAEntrypointVLD
      VAProfileHEVCMain12             :	VAEntrypointVLD
      VAProfileVP9Profile2            :	VAEntrypointVLD
      VAProfileHEVCMain444            :	VAEntrypointVLD
      VAProfileHEVCMain444_10         :	VAEntrypointVLD
      VAProfileHEVCMain444_12         :	VAEntrypointVLD
nvidia-smi
Tue Jan 14 12:33:54 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.77                 Driver Version: 565.77         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2060 ...    Off |   00000000:29:00.0  On |                  N/A |
| 29%   32C    P0             41W /  175W |    1544MiB /   8192MiB |      5%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1687      G   /run/current-system/sw/bin/Hyprland          1043MiB |
|    0   N/A  N/A      1744      G   Xwayland                                        2MiB |
|    0   N/A  N/A      2057      G   ...-firefox-134.0/bin/.firefox-wrapped        173MiB |
|    0   N/A  N/A      2157    C+G   ...q-firefox-134.0/lib/firefox/firefox        209MiB |
|    0   N/A  N/A      3316      G   kitty                                          44MiB |
+-----------------------------------------------------------------------------------------+

It’s weird: on Arch, for example, hardware acceleration works without any problem with the open modules. I guess for the time being I will keep the option set to false.

Very interesting, that would be extremely helpful! I’ll look forward to it

That makes it sound like you put the vaapi driver in environment.systemPackages. Did you look very closely at the options I set? libva itself should not be necessary either; it will be pulled in by the closures of any packages that need it.

Interesting, for sure. The open driver is packaged very differently, and part of this packaging is bundling that shared library and writing a configuration file that will point all kinds of applications at the right path. I have a suspicion that putting the VA-API driver in the wrong option will expose it to the wrong configuration file, and therefore make it attempt to load incorrect libraries. Hence my question to double check that.

If that’s not it, I think you may need to figure this out with strace - been there before, back when there were some version mismatches in what nvidia and NixOS bundled of some really low level graphics library.

Yes, you are right, sorry if I wasn’t clear. I am dumb: when I installed libva-utils and nvidia-vaapi-driver using the option you provided, I thought something went wrong because running vainfo was returning an error, but that was simply because it wasn’t in my path anymore.
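
For anyone hitting the same "vainfo disappeared" confusion, a small sketch of keeping the diagnostic tool on PATH. It assumes vainfo comes from pkgs.libva-utils, the package named in this thread:

```nix
{ pkgs, ... }:
{
  # Command-line tools go in systemPackages so they land on PATH.
  # Packages placed in the graphics extraPackages option are only exposed
  # as driver libraries, which is why vainfo was no longer found.
  environment.systemPackages = [ pkgs.libva-utils ];
}
```

The driver itself (nvidia-vaapi-driver) stays in the extraPackages option from the earlier reply; only the tool moves back.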

Anyways, after that I tried again to set open = true; but the error persists. I am not familiar with strace but when I have a little bit of time (and patience :joy:) I’ll look into the route you suggest.

Thanks again for your help.

Great effort, thanks for putting the work in! I have trouble getting Firefox hardware accel to work properly as well. Currently I don’t even get the hardware page renderer and it falls back to the software renderer, forget about video codecs…

My config: config.nix · GitHub

One thing I don’t seem to get a good answer on is if currently x11 or wayland is more reliable for firefox hardware accel with nvidia. Especially Youtube accel is quite important for me (running a 3080 Ti) and I don’t care that much about other tools or plasma gimmicks or the likes. Do you have a recommendation? Does accel work fine for you on wayland?

It’s excellent Nvidia finally open sourced the driver.

I tried to get the open source driver working on NixOS, and after banging my head for a few hours, I just went back to the 565.77 driver. It seems to work OK.

[das@t:~]$ nvidia-smi 
Wed Jan 15 16:51:16 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.77                 Driver Version: 565.77         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro T2000 with Max-Q ...    Off |   00000000:01:00.0  On |                  N/A |
| N/A   48C    P8              5W /   35W |     236MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      3995      G   ...db-gnome-shell-47.2/bin/gnome-shell        201MiB |
+-----------------------------------------------------------------------------------------+

[das@t:~]$ uname -a
Linux t 6.12.7 #1-NixOS SMP PREEMPT_DYNAMIC Fri Dec 27 13:02:20 UTC 2024 x86_64 GNU/Linux

Yep. The current driver works fine even on wlroots. Couple things about your config:

  • Just like @JamesSunderland’s, your nvidia-vaapi-driver package is put in the wrong option.
  • Manually setting the modeset and fbdev kernel args technically works, but I’d really recommend you use the nixpkgs module so that it sorts the quirks out for you instead as time goes on.
  • Don’t use _latest kernels with nvidia, you’ll run into all kinds of issues. Hell, don’t use any non-default kernels. You too @randomizedcoder.
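
A sketch of the second bullet, contrasting hand-written kernel arguments with the module option. The exact kernel parameters in the commented-out line are an assumption about what a manual setup typically looks like (the linked config is not reproduced here), and which parameters the module emits depends on your nixpkgs and driver version.

```nix
{
  # Manual approach (assumed shape): works, but you have to track
  # driver quirks yourself.
  # boot.kernelParams = [ "nvidia-drm.modeset=1" "nvidia-drm.fbdev=1" ];

  # Module approach: let the nixpkgs nvidia module derive the needed
  # nvidia-drm parameters for the driver version in use.
  hardware.nvidia.modesetting.enable = true;
}
```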

Note that the majority of all of this only properly works on Turing and newer (i.e., ~RTX2xxx+).

These are the “base” settings on top of which to apply the stuff from my earlier comment:

services.xserver.videoDrivers = [ "nvidia" ];

hardware.nvidia = {
  # This will no longer be necessary when
  # https://github.com/NixOS/nixpkgs/pull/326369 hits stable
  modesetting.enable = lib.mkDefault true;
  # Power management is nearly always required to get nvidia GPUs to
  # behave on suspend, due to firmware bugs.
  powerManagement.enable = true;
  # The open driver is recommended by nvidia now, see
  # https://download.nvidia.com/XFree86/Linux-x86_64/565.77/README/kernel_open.html
  open = true;
};

Double check that all these are set, and that you’re using the right options. Don’t set any other options.

If it doesn’t work after that + a reboot, share the output of the vainfo command too; it’s possible there’s been a regression since last Saturday, I suppose.

Wow, thanks so much, got it working on my 3080Ti, even with xanmod_latest!
Config: config.nix · GitHub

I fixed the placement of the vaapi package and cleaned the config a bit (only my second day running nixos on my main computer, so please excuse the mess :sweat_smile:)

More importantly, I think your ‘don’t set any other options’ advice prompted me to recheck my Firefox config. I tried creating a new profile (my current one is like 10 years old and has travelled across 10 different OSes), and in the fresh profile hardware accel worked straight away! So I basically nuked all the settings with ‘media’, ‘gfx’, ‘accel’ and the like in them, and now it works! Thanks so much, this really was a long, painful journey for me. I still (after close to half a year of tinkering) don’t have it fixed on my media PC, but that one isn’t that important to me, so I’m golden for the time being.

You still should not use it; or at the very least be aware that any update may irreversibly break stuff. Because the driver is out-of-tree, it cannot update in lockstep with the kernel, so it happens pretty frequently that the kernel updates and for a few weeks to a month or two the nvidia driver fails to build. Since nixpkgs doesn’t package EOL kernels, this can leave you stuck.

Sticking to LTS kernels means that you don’t run into this problem, since nvidia can target LTS releases, and nixpkgs keeps them around until they reach EOL. Dropping the _latest really will improve your experience a lot; a large part of why people struggle with nvidia is probably just that.
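
In config terms, the advice above amounts to something like the following sketch. pkgs.linuxPackages is the nixpkgs default kernel set, so simply deleting any boot.kernelPackages line has the same effect.

```nix
{ pkgs, ... }:
{
  # Avoid with the out-of-tree nvidia module: a new kernel release can
  # outpace the driver and break the build.
  # boot.kernelPackages = pkgs.linuxPackages_latest;

  # Default kernel set; stated explicitly here only for illustration.
  boot.kernelPackages = pkgs.linuxPackages;
}
```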
