No `nvidia-smi` and no GPUs to monitor

nixup · November 30, 2024, 9:02am

Hello,

I have a GTX 1060 6GB and I would like to monitor it.
Here’s my config for it:

{ config, pkgs, ... }:

{
  hardware = {
    graphics.enable = true;
    nvidia = {
      open = false; # GPU too old
      nvidiaPersistenced = true;
    };
  };
  environment.systemPackages = with pkgs; [
    nvtopPackages.nvidia
  ];

  nixpkgs.config = {
    cudaSupport = true;
    cudaCapability = [ "6.1" ];
  };
}

I do not have the nvidia-smi binary in my PATH and nvtop shows “No GPU to monitor.”
Am I missing something?

fndov · November 30, 2024, 9:58am

I had the same issue; the fix was to use a stable kernel.

boot.loader.kernelPackages = pkgs.linuxPackages;

The cause is probably some underlying hack that no longer works with the newer drivers. I didn’t have this issue before 555.

nixup · November 30, 2024, 10:37am

I am using kernel 6.6.63, which I think is stable.

boot.loader.kernelPackages does not exist, so I added boot.kernelPackages = pkgs.linuxPackages to my config instead, but it did not help :<
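
For reference, the kernel package set is selected through boot.kernelPackages, as noted above. A minimal sketch of that setting follows; pkgs.linuxPackages is already the NixOS default, so writing it out only matters if another part of the configuration overrides it, and the mention of pkgs.linuxPackages_latest is only there for contrast.

```nix
{ pkgs, ... }:

{
  # Default kernel package set from nixpkgs (normally an LTS kernel).
  # pkgs.linuxPackages_latest would select the newest kernel instead.
  boot.kernelPackages = pkgs.linuxPackages;
}
```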

eljamm · November 30, 2024, 10:46am

According to the NVIDIA page on the NixOS Wiki, you need to set:

services.xserver.videoDrivers = [ "nvidia" ];

nixup · November 30, 2024, 11:15am

I am running a headless server (GPU is for CUDA) and I did not think this would be needed.
After adding this line to my config I get:

# nvidia-smi 
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Rebooting fixed it.

Shouldn’t nvtopPackages.nvidia pull it in by default?

EDIT:
boot.kernelPackages = pkgs.linuxPackages is not needed.

eljamm · November 30, 2024, 11:24am

> I am running a headless server (GPU is for CUDA) and I did not think this would be needed.

AFAIK, this is required for enabling the nvidia module, even if no Xserver is running.

> Shouldn’t nvtopPackages.nvidia pull it in by default?

nvtopPackages.nvidia will just install the libraries it needs to work, but it won’t configure the drivers.
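
Putting the pieces from this thread together, a headless setup might look roughly like the sketch below. It reuses the options from the original post and adds the videoDrivers line. The allowUnfree line is an assumption (the proprietary driver is unfree, so it has to be allowed somewhere in the configuration), and the CUDA-related nixpkgs.config settings are left out for brevity.

```nix
{ config, pkgs, ... }:

{
  # Assumption: unfree packages are not yet allowed elsewhere in the config.
  nixpkgs.config.allowUnfree = true;

  hardware.graphics.enable = true;

  hardware.nvidia = {
    open = false; # GTX 1060 is too old for the open kernel modules
    nvidiaPersistenced = true;
  };

  # Despite the name, this is what enables the NVIDIA module,
  # even when no X server is running.
  services.xserver.videoDrivers = [ "nvidia" ];

  # nvtop only brings its own libraries; it does not configure the driver.
  environment.systemPackages = with pkgs; [
    nvtopPackages.nvidia
  ];
}
```

As reported earlier in the thread, a reboot was needed after switching to this configuration before nvidia-smi could talk to the driver.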

nixup · November 30, 2024, 12:52pm

Then I think it could be better labeled, especially if it is not 1:1 Xorg related.

Isn’t that imperative though? I thought that Nix is just “say what you want, not how you want it”.

Tagging the maintainer to weigh in on this: @gbtb

eljamm · November 30, 2024, 3:18pm

> Then I think it could be better labeled, especially if it is not 1:1 Xorg related.

It might be better to have a hardware.nvidia.enable option, but there could be a good reason why it’s currently like this as well.

> Isn’t that imperative though?

No, you’re still declaratively installing the package and configuring the drivers.

> I thought that Nix is just “say what you want, not how you want it”.

NixOS modules can automatically install packages for you, but I don’t think it works the other way around. It’s even more difficult in this case because the driver configuration largely depends on which hardware you have, which NixOS can’t guess.

That said, perhaps this will become possible in the future with something like nixos-facter.
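
To illustrate the direction that does work (a module enabling the driver and installing the package together), here is a rough sketch of a small local module. The option name local.nvidiaMonitoring is hypothetical and not part of nixpkgs, and the hardware-specific choice (open = false) still has to be made by hand.

```nix
{ config, lib, pkgs, ... }:

let
  # Hypothetical option namespace, made up for this example.
  cfg = config.local.nvidiaMonitoring;
in
{
  options.local.nvidiaMonitoring.enable =
    lib.mkEnableOption "NVIDIA driver plus nvtop on a headless host";

  config = lib.mkIf cfg.enable {
    # Enable the driver module; no X server is started by this.
    services.xserver.videoDrivers = [ "nvidia" ];
    hardware.graphics.enable = true;

    # Hardware-dependent choice that NixOS cannot guess for you.
    hardware.nvidia.open = false;

    # The monitoring tool itself.
    environment.systemPackages = [ pkgs.nvtopPackages.nvidia ];
  };
}
```

After importing this file, setting local.nvidiaMonitoring.enable = true; in the system configuration would switch on both parts at once.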

gbtb · December 1, 2024, 11:06am

Hello!

Disclaimer: I’m not a GPU driver expert and I have a limited understanding of the inner workings of CUDA/NVIDIA stuff.
Looking at the issues in the nvtop repo, it seems to me that nvtop was deliberately decoupled from most of the NVIDIA driver stack due to compatibility issues; see “[Ubuntu < 22.04] apt install nvtop breaks / requires another nvidia driver” (Issue #51, Syllo/nvtop on GitHub).
As I only have AMD GPUs, I can’t reliably test NVIDIA-related stuff myself. If you want to improve the package in this regard, feel free to open an issue on GitHub and tag the nixpkgs CUDA team; they can likely help you figure out a good solution.