Amdgpu kernel module failing during load

Hey all. I’ve been refactoring my configuration to make it easier to define new hosts. Today I tried it for the first time and found that the amdgpu module fails during initialization. Here are the relevant kernel logs:

[    2.711936] [drm] amdgpu kernel modesetting enabled.
[    2.711957] amdgpu: vga_switcheroo: detected switching method \_SB_.PCI0.GP17.VGA_.ATPX handle
[    2.712316] amdgpu: ATPX version 1, functions 0x00000201
[    2.712350] amdgpu: ATPX Hybrid Graphics
[    2.714042] amdgpu: Virtual CRAT table created for CPU
[    2.714061] amdgpu: Topology: Add CPU node
[    2.714179] amdgpu 0000:07:00.0: enabling device (0006 -> 0007)
[    2.714234] [drm] initializing kernel modesetting (YELLOW_CARP 0x1002:0x1681 0x1043:0x1A5C 0xC7).
[    2.714242] [drm] register mmio base: 0xFC500000
[    2.714243] [drm] register mmio size: 524288
[    2.717620] [drm] add ip block number 0 <nv_common>
[    2.717622] [drm] add ip block number 1 <gmc_v10_0>
[    2.717624] [drm] add ip block number 2 <navi10_ih>
[    2.717625] [drm] add ip block number 3 <psp>
[    2.717626] [drm] add ip block number 4 <smu>
[    2.717627] [drm] add ip block number 5 <dm>
[    2.717629] [drm] add ip block number 6 <gfx_v10_0>
[    2.717630] [drm] add ip block number 7 <sdma_v5_2>
[    2.717631] [drm] add ip block number 8 <vcn_v3_0>
[    2.717633] [drm] add ip block number 9 <jpeg_v3_0>
[    2.717655] amdgpu 0000:07:00.0: amdgpu: Fetched VBIOS from VFCT
[    2.717657] amdgpu: ATOM BIOS: 113-REMBRANDT-X35
[    2.717701] amdgpu 0000:07:00.0: Direct firmware load for amdgpu/yellow_carp_toc.bin failed with error -2
[    2.717704] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <psp> failed -19
[    2.717964] amdgpu 0000:07:00.0: Direct firmware load for amdgpu/yellow_carp_dmcub.bin failed with error -2
[    2.717966] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <dm> failed -19
[    2.718176] amdgpu 0000:07:00.0: Direct firmware load for amdgpu/yellow_carp_pfp.bin failed with error -2
[    2.718178] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <gfx_v10_0> failed -19
[    2.718386] amdgpu 0000:07:00.0: Direct firmware load for amdgpu/yellow_carp_sdma.bin failed with error -2
[    2.718389] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <sdma_v5_2> failed -19
[    2.718608] amdgpu 0000:07:00.0: Direct firmware load for amdgpu/yellow_carp_vcn.bin failed with error -2
[    2.718610] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <vcn_v3_0> failed -19
[    2.718796] amdgpu 0000:07:00.0: amdgpu: Fatal error during GPU init
[    2.718798] amdgpu 0000:07:00.0: amdgpu: amdgpu: finishing device.

If error -2 means ENOENT, then the amdgpu module isn’t finding some of its firmware files. I suspect that something is badly wrong with my config, and that setting boot.initrd.kernelModules = [ "amdgpu" ] is configuring the wrong kernel instance or something.

I suspect the configuration because I’m setting a custom instance of pkgs, adding an overlay for unstable, and setting boot.kernelPackages = pkgs.linuxPackages_latest.

The working configuration is in this folder, just a configuration.nix and hardware.nix

My rewrite is a lot more verbose, with modules, library functions, etc. The most relevant stuff would be in modules/default.nix here, which sets the kernel packages:

boot = {
  # Prefer the latest kernel
  kernelPackages = mkDefault pkgs.linuxPackages_latest;
  loader = {
    efi.canTouchEfiVariables = mkDefault true;
    systemd-boot.enable = mkDefault true;
    systemd-boot.configurationLimit = mkDefault 10;
  };
};

lib/hosts.nix here, which configures the top level nixosSystem invocation:

{
  vicos,
  pkgsDefaults,
}:
{
  mkHost =
    {
      name,
      system,
      configuration,
    }:
    let
      inherit (vicos.inputs) nixpkgs nixpkgs-unstable home-manager;
      pkgs =
        import nixpkgs {
          inherit system;
          overlays = [
            (self: super: {
              unstable = import nixpkgs-unstable { inherit system; } // pkgsDefaults;
            })
          ];
        }
        // pkgsDefaults;
    in
    nixpkgs.lib.nixosSystem {
      modules = [
        nixpkgs.nixosModules.readOnlyPkgs
        home-manager.nixosModules.home-manager
        {
          nixpkgs.pkgs = pkgs;
          networking.hostName = name;
        }
        (import ../modules)
        {
          vicos.flake = {
            inherit (vicos) lib path inputs;
            inherit system;
          };
        }
        configuration
      ] ++ (vicos.lib.walkModules ../modules);
    };
}

and hosts/thunkbox here which is the config I’m testing.

Any help would be appreciated, as well as some sources or documentation that would be relevant!

It just looks like you don’t have linux-firmware installed. This is normally done as part of the hardware-configuration.nix file that nixos-generate-config creates.

What does nixos-option hardware.enableRedistributableFirmware say?
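
For reference, here is a minimal sketch of the setting in question, assuming it goes into a module that the host imports (where it belongs in your rewritten layout is a guess on my part). On a typical bare-metal install, the generated hardware-configuration.nix gets the same default indirectly by importing installer/scan/not-detected.nix from modulesPath:

```nix
{ lib, ... }:
{
  # Adds the redistributable firmware packages to hardware.firmware.
  # This includes linux-firmware, which should provide the
  # amdgpu/yellow_carp_*.bin files that the log reports as missing (error -2).
  # mkDefault keeps the value overridable by an individual host.
  hardware.enableRedistributableFirmware = lib.mkDefault true;
}
```

After a rebuild and reboot, the "Direct firmware load ... failed" lines should no longer show up if this was the only problem.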

Thank you! That was it. I had looked it up in the NixOS options and mistakenly thought the default was true. I’ll have to double-check that I carried over all the appropriate options from my generated hardware.nix next time!