Should boot.extraModulePackages affect other NixOS generations?

Hello community,

I’m trying to understand what happened after I played with my config. Essentially I am having some network connection losses on a fairly recent mainboard and I suspect the network driver.
My original hardware config was:

boot.kernelPackages = pkgs.linuxPackages_latest;
boot.kernelModules = [ "vfio_pci" "vfio" "vfio_iommu_type1" "kvm-amd"  ];
boot.extraModulePackages =  [ ];
boot.blacklistedKernelModules = [ ];

which essentially worked. I am 99% sure that lsmod did not show the r8125 module at this point.

It was using the r8169 module (as it should by default). Since I have an RTL8125 chip, I figured that trying to load the r8125 module could not hurt, so I changed this to:

boot.kernelPackages = pkgs.linuxPackages_latest;
boot.kernelModules = [ "vfio_pci" "vfio" "vfio_iommu_type1" "kvm-amd"  ];
boot.extraModulePackages = with config.boot.kernelPackages; [ r8125 ];
boot.blacklistedKernelModules = [ "r8169" ];

After a generation switch and reboot this did not work at all (it did not recognize the device), so obviously I switched back to the previous generation, then reverted my code change, built another generation and rebooted.

To my surprise this no longer worked. Effectively I now had three generations:

  1. before the change ← works, but I can now see the r8125 module, which (according to my memory) was not visible before
  2. with the change ← doesn’t work
  3. with the change reverted ← also does not work. The r8125 module still exists and gets preferred over the r8169 module

In the end, after playing around a bit, I needed to make the following change to get it working again:

boot.kernelPackages = pkgs.linuxPackages_latest;
boot.kernelModules = [ "vfio_pci" "vfio" "vfio_iommu_type1" "kvm-amd" "r8169" ];
boot.extraModulePackages = [  ];
boot.blacklistedKernelModules = [ "r8125" ];

i.e. blacklisting the module and making sure that the default gets loaded (while this does not solve my lost carrier problem, it at least restores network connectivity to the original state).
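For anyone repeating this experiment, here is a minimal sketch of how the two driver setups above could be driven by a single toggle, so that the extra module package and the blacklist never get out of sync between rebuilds. The name `useVendorDriver` is made up for illustration and is not a NixOS option; the rest only uses the options already shown in this post. Whether r8125 actually drives a given board is a separate question.

```nix
{ config, lib, pkgs, ... }:

let
  # Hypothetical toggle, not a NixOS option:
  #   true  -> out-of-tree r8125 driver, in-tree r8169 blacklisted
  #   false -> in-tree r8169 driver, r8125 blacklisted
  useVendorDriver = false;
in
{
  boot.kernelPackages = pkgs.linuxPackages_latest;

  boot.kernelModules =
    [ "vfio_pci" "vfio" "vfio_iommu_type1" "kvm-amd" ]
    # Explicitly load the in-tree driver when the vendor one is not used.
    ++ lib.optionals (!useVendorDriver) [ "r8169" ];

  # Only build and install r8125 when it is actually wanted.
  boot.extraModulePackages =
    lib.optionals useVendorDriver [ config.boot.kernelPackages.r8125 ];

  # Blacklist whichever driver is not selected.
  boot.blacklistedKernelModules =
    if useVendorDriver then [ "r8169" ] else [ "r8125" ];
}
```

Flipping the toggle and rebuilding produces a new generation with a consistent pair of settings, which makes it easier to tell afterwards which generation was supposed to load which driver.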

Is this expected when working with kernel modules? I.e. can a change of kernel modules in a later generation affect earlier generations?

Thanks for any insights!

No. If this happened as described and isn’t just a glitch in human memory, this sounds like a pretty severe bug.

Does the kernel somehow statefully remember which modules it loaded last time? I’d love to know if you can reproduce this.

To be honest, maybe it was just a glitch of my memory.

Maybe I did something really stupid and then misremembered, or just plain booted the wrong generation. I tried to reproduce it once more but couldn’t; maybe I got lost in the generations or had a buggy line of code in the config. I have old Git revisions of course, but I didn’t commit anything while this was going on, since nothing was working. So I only have a before and an after state… And now I seemingly can’t get into the “in-between” state again.

As an excuse: debugging got really complicated once the network was lost, since I couldn’t really do much in terms of adding packages, building new generations etc.

I was mainly curious if there is maybe a special thing about kernel modules. I would have also thought that (unless there is a change to some arcane configuration file untouched by NixOS) it should be reproducible.

Thanks a lot!

(I will also close this and mark as solved, since I’m more interested in the network glitch itself)

FYI, nixos-rebuild has an --offline switch, which stops Nix from complaining about a missing network connection.