Kernel anomalies

Agreed, started a new thread since I'm curious about your thoughts.

I'm compiling a slightly modded version of the zen1 6.8.9 kernel from Arch; I needed it for media features and smoother streaming.
I did try compiling just the production/latest drivers on a variety of kernels before settling on the one I have now, including the default kernel (problems with 560).
The default had tons of latency issues for my workload, so I needed a more optimized kernel. I use Nix on my daily machine.

I've been wanting to break into kernel v6 and tinker with the Rust stuff a bit, but I've never been able to get anything past v6.8.9 to compile with the production drivers. I still had problems with the prod drivers too. I did pin the GPU drivers:

hardware.nvidia.package = let
  rcu_patch = pkgs.fetchpatch {
    url = "https://github.com/gentoo/gentoo/raw/c64caf53/x11-drivers/nvidia-drivers/files/nvidia-drivers-470.223.02-gpl-pfn_valid.patch";
    hash = "sha256-eZiQQp2S/asE7MfGvfe6dA/kdCvek9SYa/FFGp24dVg=";
  };
in config.boot.kernelPackages.nvidiaPackages.mkDriver {
    #version = "545.29.06";
    #sha256_64bit = "sha256-grxVZ2rdQ0FsFG5wxiTI3GrxbMBMcjhoDFajDgBFsXs=";
    #sha256_aarch64 = "sha256-o6ZSjM4gHcotFe+nhFTePPlXm0+RFf64dSIDt+RmeeQ=";
    #openSha256 = "sha256-h4CxaU7EYvBYVbbdjiixBhKf096LyatU6/V6CeY9NKE=";
    #settingsSha256 = "sha256-YBaKpRQWSdXG8Usev8s3GYHCPqL8PpJeF6gpa2droWY=";
    #persistencedSha256 = "sha256-AiYrrOgMagIixu3Ss2rePdoL24CKORFvzgZY3jlNbwM=";

    version = "535.154.05";
    sha256_64bit = "sha256-fpUGXKprgt6SYRDxSCemGXLrEsIA6GOinp+0eGbqqJg=";
    sha256_aarch64 = "sha256-G0/GiObf/BZMkzzET8HQjdIcvCSqB1uhsinro2HLK9k=";
    openSha256 = "sha256-wvRdHguGLxS0mR06P5Qi++pDJBCF8pJ8hr4T8O6TJIo=";
    settingsSha256 = "sha256-9wqoDEWY4I7weWW05F4igj1Gj9wjHsREFMztfEmqm10=";
    persistencedSha256 = "sha256-d0Q3Lk80JqkS1B54Mahu2yY/WocOqFFbZVBh+ToGhaE=";

    #version = "550.40.07";
    #sha256_64bit = "sha256-KYk2xye37v7ZW7h+uNJM/u8fNf7KyGTZjiaU03dJpK0=";
    #sha256_aarch64 = "sha256-AV7KgRXYaQGBFl7zuRcfnTGr8rS5n13nGUIe3mJTXb4=";
    #openSha256 = "sha256-mRUTEWVsbjq+psVe+kAT6MjyZuLkG2yRDxCMvDJRL1I=";
    #settingsSha256 = "sha256-c30AQa4g4a1EHmaEu1yc05oqY01y+IusbBuq+P6rMCs=";
    #persistencedSha256 = "sha256-11tLSY8uUIl4X/roNnxf5yS2PQvHvoNjnd2CB67e870=";

    patches = [ rcu_patch ];
};

boot.kernelModules = [ "msr" "kvm-amd" "vfio_pci" "vfio_iommu_type1" "vfio" "nfs" ];

boot.kernelPackages = let
  linux_demon_pkg = { fetchurl, buildLinux, ... }@args:
    buildLinux (args // rec {
      version = "6.8.9-demon";
      modDirVersion = "6.8.9-zen1";

      src = pkgs.fetchurl {
        url = "https://github.com/Mephist0phel3s/zen1-6.8.9/archive/v6.8.9-demon.tar.gz";
        sha256 = "sha256-JfDfJbQzDPVth2zwj59jQQgvv58WcvsMtysRB2+wGRg=";
      };
    } // (args.argsOverride or { }));

  linux_demon = pkgs.callPackage linux_demon_pkg { };
in pkgs.recurseIntoAttrs (pkgs.linuxPackagesFor linux_demon);

Hard to tell what causes compilation issues if you’re not running a vanilla kernel. Nvidia obviously only develops against the upstream kernel; the very nature of what they’re doing makes anything else infeasible. It’s still their fault for not writing a proper Mesa driver, but once you accept that, a lot of the issues are kind of out of their control.

While I appreciate that you want to use a different kernel, I think the only way to get to the bottom of this is to completely remove any kernel or driver version tweaks and just try the literal default definitions from nixpkgs (so not the production driver either). The production driver is quite old; the current stable driver (565.77) is definitely the most stable version I’ve run.
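
As a rough sketch of what "the literal defaults" could look like, assuming a plain module-style configuration.nix: the option names are standard NixOS options, but the surrounding module layout is only an illustration and not taken from anyone's actual config.

```nix
{ config, pkgs, ... }:
{
  # Stock nixpkgs kernel. This is what you get when the option is unset;
  # it is spelled out here only to make the baseline explicit.
  boot.kernelPackages = pkgs.linuxPackages;

  # Load the proprietary driver for the graphical session.
  services.xserver.videoDrivers = [ "nvidia" ];

  # Default driver for the selected kernel: no mkDriver call,
  # no pinned version or hashes, no extra patches.
  hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.stable;

  # Assumption: depending on the NixOS release and driver version,
  # hardware.nvidia.open may also need to be set explicitly.
}
```

If this baseline builds and runs, the kernel and driver tweaks can be reintroduced one at a time to see which one breaks.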

Personally I’m running pkgs.linuxKernel.packages.linux_xanmod from stable (since it has some nice tweaks for AMD processors and PREEMPT_RT helps with some audio issues I have), with this specific driver: dotfiles/nixos-modules/nvidia/default.nix at 561931560d2c12e81f139ef8c681e6d99fc6c54e · TLATER/dotfiles · GitHub

Compilation obviously works fine, and at runtime I’m seeing no performance hiccups of note, only a bug in Firefox that causes WebRTC to crash, but I don’t think that’s related to nvidia.
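
The kernel choice described above amounts to a one-line option. A minimal sketch, assuming the attribute path quoted in the post; the driver side lives in the linked dotfiles and is not reproduced here:

```nix
{ pkgs, ... }:
{
  # Xanmod kernel package set from nixpkgs, using the attribute path
  # mentioned above. The nvidia module is then built against this kernel.
  boot.kernelPackages = pkgs.linuxKernel.packages.linux_xanmod;
}
```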

That said, again, if you do anything besides running the LTS vanilla kernel you’re kind of on your own, and your only valid complaint should be that out-of-tree kernel modules should be prohibited. Linux upstream only doesn’t agree because that’d make it hard to convince some vendors, but at least Google is pushing in that direction…

Yeah, you're not wrong. I didn't try the xanmod kernel; it was the only one I didn't.
It did throw a few errors about the nvidia kernel modules in particular at compile time, but running with the ignore-errors flag showed that the rest of the kernel compiled most of the time.

I like to tinker and see what I can figure out by poking at shit. My original reason for running this kernel was VM stability and optimization, as it was recommended on the Arch forums for this type of load. I was working on bypassing the anti-cheat in PUBG so I could play in a VM without getting session kicked.
I think I'm going to give your kernel a whirl with 565.77 later today and see what happens.
If it makes a difference, the zen1 kernel is an available option in the nixpkgs repo. I wrote a patch for a guy, forked the kernel altogether during the process, and started using the patched version myself.
Do you build with the open flags on nvidia or no?
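
For comparison, selecting the packaged zen kernel instead of the fork might look like the sketch below. Note the assumption: `pkgs.linuxPackages_zen` follows whatever zen release the nixpkgs channel currently ships, not 6.8.9 specifically, and it does not include the custom patch.

```nix
{ pkgs, ... }:
{
  # Zen kernel package set as packaged in nixpkgs (unpatched,
  # version determined by the channel, not pinned to 6.8.9).
  boot.kernelPackages = pkgs.linuxPackages_zen;
}
```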

It’s all in there: dotfiles/nixos-modules/nvidia/default.nix at 561931560d2c12e81f139ef8c681e6d99fc6c54e · TLATER/dotfiles · GitHub

That whole directory is the condensed knowledge I have about how one gets the nvidia driver to run on NixOS :wink:

Went to go read that, got sidetracked like crazy, and then my wife walked in and was like “I want a computer” and I was like, well. Alright?
She be like “A nice one…”

So TL;DR: I'm giving her my 3080 FE and just ordered a reference-edition Radeon 6700 XT because I'm over NVIDIA. Perfect opportunity to kill two birds with one stone.

Sooo, problem solved, I guess?
Sorry it took so long to reply, I was ordering parts for her new build lol


Even though this is basically a non-issue now, I figured out why the new drivers wouldn't build, and it's because I'm stupid and didn't read the stack trace thoroughly enough.

Removed the RCU patch from the call and it built fine.
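
For anyone landing here later, a minimal sketch of the call with the patch dropped. The version and hashes are copied from the commented-out 550.40.07 entry in the first post; which release "the new drivers" actually refers to is an assumption, and any other release would need its own hashes.

```nix
{ config, ... }:
{
  hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.mkDriver {
    # Values taken from the commented-out 550.40.07 block in the first post
    # (assumed here to be the "new" driver; adjust for the release you want).
    version = "550.40.07";
    sha256_64bit = "sha256-KYk2xye37v7ZW7h+uNJM/u8fNf7KyGTZjiaU03dJpK0=";
    sha256_aarch64 = "sha256-AV7KgRXYaQGBFl7zuRcfnTGr8rS5n13nGUIe3mJTXb4=";
    openSha256 = "sha256-mRUTEWVsbjq+psVe+kAT6MjyZuLkG2yRDxCMvDJRL1I=";
    settingsSha256 = "sha256-c30AQa4g4a1EHmaEu1yc05oqY01y+IusbBuq+P6rMCs=";
    persistencedSha256 = "sha256-11tLSY8uUIl4X/roNnxf5yS2PQvHvoNjnd2CB67e870=";

    # No `patches = [ rcu_patch ];` here. Going by its filename, the
    # pfn_valid patch was written against the 470.223.02 driver.
  };
}
```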
