Nvidia-smi "unable to determine device handle" for GPU after Blender crash

I updated my system and then started blender with nvidia-offload blender. When I try to “pan” the view in blender, the application crashes. After the crash, the nvidia GPU stops working, and I have to reboot the whole system for the nvidia card to work again.

This is the error that shows up with nvidia-smi after blender crashes:

$ nvidia-smi
Unable to determine the device handle for GPU0000:01:00.0: Unknown Error

This is the nvidia-smi output before running blender:

$ nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.58.02              Driver Version: 555.58.02      CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3050 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   45C    P8              3W /   35W |       8MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A     89709      G   ...-gnome-shell-46.3.1/bin/gnome-shell          1MiB |
+-----------------------------------------------------------------------------------------+

Steam games work with the GPU without issue. The error happens only when using blender.

The blender package I am using comes from an overlay in edolstra's GitHub repository (I followed the guide here because I wanted to use nvidia CUDA). The Blender version is 4.2. The following is the content of my flake.nix (note: I have cut out the parts that are irrelevant to this post):

{
  description = "My first flake!";

  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";

    blender-bin.url = "github:edolstra/nix-warez?dir=blender";

  };

  outputs = {self, nixpkgs, blender-bin, ... }:
    let
      system = "x86_64-linux";
      lib = nixpkgs.lib;
      pkgs = import nixpkgs { inherit system; };
    in {
      nixosConfigurations = {
        nixos = lib.nixosSystem {
          inherit system;

          modules = [
            # Add the overlay of blender-bin
            # <https://discourse.nixos.org/t/how-to-get-cuda-working-in-blender/5918/21?u=aboode95>
            ({ config, pkgs, ... }: {
              nixpkgs.overlays = [ blender-bin.overlays.default ];
              environment.systemPackages = with pkgs; [ blender_4_2 ];
            })
            
            ./configuration.nix

          ];
        };
      };
    };
}

What could be causing this error? And how can one fix it?~

Thank you in advance!

Edit: I noticed that upon trying to run blender again, it shows EGL Errors:

$ nvidia-offload blender
EGL Error (0x3003): EGL_BAD_ALLOC: EGL failed to allocate resources for the requested operation.
EGL Error (0x3003): EGL_BAD_ALLOC: EGL failed to allocate resources for the requested operation.
EGL Error (0x3003): EGL_BAD_ALLOC: EGL failed to allocate resources for the requested operation.
EGL Error (0x3003): EGL_BAD_ALLOC: EGL failed to allocate resources for the requested operation.

You might find it difficult to get community support for non-free software.

Interestingly, I get an EGL error as well, though mine reports success. Blender does not open at all, even though it shows up in my dock.

EGL Error (0x3000): EGL_SUCCESS: The last function succeeded without error.
EGL Error (0x3000): EGL_SUCCESS: The last function succeeded without error.
EGL Error (0x3000): EGL_SUCCESS: The last function succeeded without error.
EGL Error (0x3000): EGL_SUCCESS: The last function succeeded without error.

Digging a bit deeper, I found this issue which suggests that this is a Wayland+Nvidia issue.

As a workaround, you can try to open blender in Xwayland with:

WAYLAND_DISPLAY="" blender

With this, blender opens for me and I tested that CUDA works as well.
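
If you use this workaround often, the same idea can be wrapped in a small shell function so the variable is cleared only for the one program and not for the whole session. This is just a sketch: run_on_xwayland is a made-up name, not something shipped by any package, and it assumes the program falls back to X11/Xwayland when WAYLAND_DISPLAY is empty, as blender did in the test above.

```shell
# Hypothetical helper (the name is an assumption, not part of any package):
# run a command with WAYLAND_DISPLAY emptied for that process only.
# The surrounding shell session keeps its original value.
run_on_xwayland() {
  env WAYLAND_DISPLAY= "$@"
}

# Example usage (not executed here):
#   run_on_xwayland blender
```

Using env keeps the change scoped to the child process, so other Wayland applications started from the same terminal are unaffected.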


Since the issue arose after an nvidia driver update, I simply moved back to a previous driver. I didn’t want to pin an exact driver version, so I switched from the “stable” package to the “production” one.

# changing from:
package = config.boot.kernelPackages.nvidiaPackages.stable;

# to:
package = config.boot.kernelPackages.nvidiaPackages.production;

After nixos-rebuild, this changed my nvidia driver version from 555.58.02 to 550.107.02.
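
To double-check which driver is actually loaded after the rebuild (typically after a reboot, so that the new kernel module is in use), one option is to ask nvidia-smi directly. A rough sketch, assuming nvidia-smi is on PATH; check_driver_version and the NVIDIA_SMI override variable are names made up for this example.

```shell
# Hypothetical helper: compare the loaded driver version with an expected one.
# NVIDIA_SMI is an assumed override variable; it defaults to the real nvidia-smi.
check_driver_version() {
  expected="$1"
  # Query only the driver version; take the first GPU's line.
  actual=$("${NVIDIA_SMI:-nvidia-smi}" --query-gpu=driver_version --format=csv,noheader | head -n 1)
  if [ "$actual" = "$expected" ]; then
    echo "driver $actual matches"
  else
    echo "driver is '$actual', expected '$expected'" >&2
    return 1
  fi
}

# Example usage (not executed here):
#   check_driver_version 550.107.02
```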

I knew which driver would be installed by looking into this.


Thank you @eljamm for your reply! Unfortunately, I wasn’t able to try your method before changing my nvidia driver, though I am sure it would have helped~
