Random freeze of system

I have an issue with my system: it freezes randomly. I suspected the Nvidia driver, since I have an RTX 4070 with the 550 drivers. I set the following kernel parameters:

 nvidia_drm.fbdev=1 ibt=off quiet splash apm=power_off acpi=force loglevel=4 nvidia-drm.modeset=1 

but it did not help. Here is the output of journalctl -b -1: http://0x0.st/XXO8.txt. To restart the system I have to do a hard reboot by pressing the power button.

Does this also happen with a free Nvidia driver stack? NVK isn’t quite there yet in terms of performance but it should at least work.

If you can’t reproduce it on a free stack, I’m afraid there is not much we can do for you here; you’d have to ask Nvidia for help and I doubt they’d help you.

I used to get this frequently whenever I ran nixos-rebuild switch under Gnome Wayland (and sometimes randomly too, but not often enough to be annoying). I’ve since switched back to X11 and it hasn’t happened again yet.

I’m also using the Nvidia drivers 550.

It usually happens during

nixos-rebuild switch 

or on logout/login, or when the session is locked, but purely random freezes occur too.

Before NixOS with Gnome and Wayland I used Arch with X11, and I had the same issue there.

Did you also use Gnome on Arch? Because this might be an issue with gdm. If that’s the case, can you try another DM like LightDM?

No, on Arch I used Lemur or Ly.

In that case, I doubt we can do anything about it.

This discussion thread on Nvidia’s forum might be relevant: Series 550 freezes laptop - Linux - NVIDIA Developer Forums

Thank you for the link. I am thinking of rolling back to 535, since it worked without issues before. That raises a question about Wayland: I read that Wayland is not fully supported with the 535 driver. Are there any workarounds to make it work?

Regarding the rollback, is this the right way to do it on NixOS?

# in nvidia.nix
hardware.nvidia.package = {
 config.boot.kernelPackages.nvidiaPackages.mkDriver {
    version = "535.154.05";
    sha256_64bit = "sha256-fpUGXKprgt6SYRDxSCemGXLrEsIA6GOinp+0eGbqqJg=";
    sha256_aarch64 = "sha256-G0/GiObf/BZMkzzET8HQjdIcvCSqB1uhsinro2HLK9k=";
    openSha256 = "sha256-wvRdHguGLxS0mR06P5Qi++pDJBCF8pJ8hr4T8O6TJIo=";
    settingsSha256 = "sha256-9wqoDEWY4I7weWW05F4igj1Gj9wjHsREFMztfEmqm10=";
    persistencedSha256 = "sha256-d0Q3Lk80JqkS1B54Mahu2yY/WocOqFFbZVBh+ToGhaE=";
 };

It is the example from the NixOS wiki.

P.S. On Arch I used 535xx-dkms and 535xx-utils, and it was great. Is there a way to get a DKMS version of the drivers?

Yes.

I don’t think that the -dkms distinction matters on NixOS. DKMS stands for dynamic kernel module support, which means that it automatically rebuilds the Nvidia driver whenever a different version of the kernel is installed. On NixOS, Nix already handles the rebuild.

I’m not aware of any way to make Wayland work better with Nvidia’s driver, so I’m not commenting on that part. I mostly use Plasma X11 on my Nvidia machine to avoid some glitches.

OK, thank you for all the advice.
I will try it. Otherwise, I know about some patches made for 545, but I remember that with 545 I could not even log in, at least with the X11 server.

Where can I find all the Nvidia drivers available for NixOS?

Thank you again

Unfortunately, my system just froze today, but this time after waking up from sleep. I’m gonna try enabling hardware.nvidia.powerManagement.finegrained as the docs suggest and see if this helps.
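For reference, here is a minimal sketch of the two power-management options in the hardware.nvidia module. It rests on two assumptions that should be checked against the option documentation for your nixpkgs revision: that powerManagement.enable is the option aimed at suspend/resume problems, and that finegrained is intended for PRIME offload laptop setups.

```nix
{ ... }:
{
  hardware.nvidia = {
    # Assumed to be the option relevant to freezes after waking from sleep;
    # the docs describe it as experimental.
    powerManagement.enable = true;

    # Powers the GPU down when idle. Assumed here to require a PRIME
    # offload setup, so it is left off in this sketch.
    powerManagement.finegrained = false;
  };
}
```

If the freeze persists with this setting, rolling back the driver remains the other option discussed in this thread.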

Thanks for the link. This is probably the root issue here so I’ll try rolling back the driver if this happens again (which it probably will :sweat_smile:).

For NixOS, the nvidia drivers are listed here.
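As a rough illustration, a prepackaged driver branch can be selected directly instead of calling mkDriver. The attribute names below (production, stable, beta, legacy_470) are the ones I believe nixpkgs exposes under nvidiaPackages; which version each one points to depends on your nixpkgs revision, so treat them as assumptions and check the list first.

```nix
{ config, ... }:
{
  # Pick one prepackaged branch. The set hangs off the kernel package set,
  # so the kernel module is built for the kernel you actually boot.
  hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.production;

  # Other branches (assumed present in your nixpkgs revision):
  #   config.boot.kernelPackages.nvidiaPackages.stable
  #   config.boot.kernelPackages.nvidiaPackages.beta
  #   config.boot.kernelPackages.nvidiaPackages.legacy_470
}
```

If none of the branches carries the version you want, pinning an exact version with mkDriver is the fallback.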

I tried to downgrade my Nvidia drivers. Here is my nvidia.nix:

{ config, lib, pkg, ... }:
{
  # Enable OpenGL
  hardware.opengl = {
    enable = true;
    driSupport = true;
    driSupport32Bit = true;
  };

  # Load Nvidia driver
  services.xserver.videoDrivers = [ "nvidia" ];

  hardware.nvidia = {
    modesetting.enable = true;
    powerManagement.enable = false;
    powerManagement.finegrained = false;
    open = false;
    nvidiaSettings = true;
    package = {
      config.boot.kernelPackages.nvidiaPackages.mkDriver = {
        version = "535.154.05";
        sha256_64bit = "sha256-fpUGXKprgt6SYRDxSCemGXLrEsIA6GOinp+0eGbqqJg=";
        sha256_aarch64 = "sha256-G0/GiObf/BZMkzzET8HQjdIcvCSqB1uhsinro2HLK9k=";
        openSha256 = "sha256-wvRdHguGLxS0mR06P5Qi++pDJBCF8pJ8hr4T8O6TJIo=";
        settingsSha256 = "sha256-9wqoDEWY4I7weWW05F4igj1Gj9wjHsREFMztfEmqm10=";
        persistencedSha256 = "sha256-d0Q3Lk80JqkS1B54Mahu2yY/WocOqFFbZVBh+ToGhaE=";
      };
    };
  };

  # Kernel parameters
  boot.kernelParams = [
    "nvidia_drm.fbdev=1"
    "ibt=off"
  ];

}

but I got this error

error: attribute 'useProfiles' missing

       at /nix/store/801l7gvdz7yaibhjsxqx82sc7zkakjbq-source/nixos/modules/hardware/video/nvidia.nix:453:62:

          452|         environment.etc = {
          453|           "nvidia/nvidia-application-profiles-rc" = lib.mkIf nvidia_x11.useProfiles {source = "${nvidia_x11.bin}/share/nvidia/nvidia-application-profiles-rc";};
             |                                                              ^
          454|

What am I doing wrong?

This is the correct way to set the package:

    package = config.boot.kernelPackages.nvidiaPackages.mkDriver {
      version = "535.154.05";
      sha256_64bit = "sha256-fpUGXKprgt6SYRDxSCemGXLrEsIA6GOinp+0eGbqqJg=";
      sha256_aarch64 = "sha256-G0/GiObf/BZMkzzET8HQjdIcvCSqB1uhsinro2HLK9k=";
      openSha256 = "sha256-wvRdHguGLxS0mR06P5Qi++pDJBCF8pJ8hr4T8O6TJIo=";
      settingsSha256 = "sha256-9wqoDEWY4I7weWW05F4igj1Gj9wjHsREFMztfEmqm10=";
      persistencedSha256 = "sha256-d0Q3Lk80JqkS1B54Mahu2yY/WocOqFFbZVBh+ToGhaE=";
    };

Oh, I’m sorry that I didn’t pay attention to the details and replied “yes” previously in this thread.

The first version (the one I said “yes” to) is wrong in that it has an extra pair of braces:

- hardware.nvidia.package = {
+ hardware.nvidia.package =
    config.boot.kernelPackages.nvidiaPackages.mkDriver {
      # ...
    }
- };
+ ;

The second version (the one in nvidia.nix) is wrong in that mkDriver is a function to be called, so there shouldn’t be an = after mkDriver. @eljamm provided a corrected version.
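To illustrate why the attribute-set form leads to the "attribute 'useProfiles' missing" error, here is a small self-contained sketch that can be evaluated on its own. The mkDriver below is a hypothetical stand-in, not the real nixpkgs function; it only mimics the fact that calling the function returns a value carrying a useProfiles attribute, whereas writing mkDriver = { ... } merely builds a nested attribute set.

```nix
let
  # Hypothetical stand-in for nvidiaPackages.mkDriver: a function that
  # takes an attribute set and returns a "package" with useProfiles.
  mkDriver = args: {
    inherit (args) version;
    useProfiles = true;
  };

  # Wrong: an attribute set that merely has a key named mkDriver.
  wrong = { mkDriver = { version = "535.154.05"; }; };

  # Right: the result of applying the function to its arguments.
  right = mkDriver { version = "535.154.05"; };
in
{
  wrongHasUseProfiles = wrong ? useProfiles; # false
  rightHasUseProfiles = right ? useProfiles; # true
}
```

The NixOS module reads useProfiles from whatever hardware.nvidia.package is set to, so only the function-call form gives it something it can use.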

Thank you all for the help!
With the 535 drivers it works great; even my problem with powering off is solved. Unfortunately I cannot use a Wayland session, though. I hope Nvidia will make something usable one day.

Has anyone tried alternatives to Nvidia’s proprietary drivers? Is it worth it?

I can’t really speak from experience since I’ve only used the proprietary drivers, but you can try using Nouveau/NVK. Keep in mind that the performance might not be that great in comparison (at least at the moment). I see no harm in trying them out, though.

There is not much information about Nouveau on NixOS. How do I switch to Nouveau? What should I change in my config?

If you’re not blacklisting the nouveau driver module and you don’t have the proprietary drivers installed then it’s enabled by default by mesa.

So in essence, just remove/undo the steps you made in the Nvidia docs. You might want to keep OpenGL enabled, though:

  # Enable OpenGL
  hardware.opengl = {
    enable = true;
    driSupport = true;
    driSupport32Bit = true;
  };

To check which driver the card is currently using, run:

$ lspci -nnk | grep -A3 -e VGA
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA106M [GeForce RTX 3060 Mobile / Max-Q] [10de:2560] (rev a1)
        Subsystem: Lenovo Device [17aa:3afe]
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

In my case, the output shows that the proprietary nvidia driver is in use.