Troubleshooting AMD RX 7900 XT GPU Passthrough on Proxmox VE with NixOS Guest: Firmware Loading Issues

Hey everyone,

I’m running into a problem setting up a NixOS guest on a Proxmox VE host with an RX 7900 XT passed through. The amdgpu driver detects the card, but every firmware load fails during GPU initialization.

When running sudo dmesg | grep amdgpu, I get the following output:

[ 5.505230] amdgpu: Virtual CRAT table created for CPU
[ 5.505261] amdgpu: Topology: Add CPU node
[ 5.516123] amdgpu 0000:01:00.0: amdgpu: initializing kernel modesetting (IP DISCOVERY 0x1002:0x744C 0x1002:0x1002 0xCC).
[ 5.516143] amdgpu 0000:01:00.0: amdgpu: register mmio base: 0x81100000
[ 5.516146] amdgpu 0000:01:00.0: amdgpu: register mmio size: 1048576
[ 5.521766] amdgpu 0000:01:00.0: amdgpu: detected ip block number 0 <common_v1_0_0> (soc21_common)
[ 5.521771] amdgpu 0000:01:00.0: amdgpu: detected ip block number 1 <gmc_v11_0_0> (gmc_v11_0)
[ 5.521774] amdgpu 0000:01:00.0: amdgpu: detected ip block number 2 <ih_v6_0_0> (ih_v6_0)
[ 5.521777] amdgpu 0000:01:00.0: amdgpu: detected ip block number 3 <psp_v13_0_0> (psp)
[ 5.521779] amdgpu 0000:01:00.0: amdgpu: detected ip block number 4 <smu_v13_0_0> (smu)
[ 5.521783] amdgpu 0000:01:00.0: amdgpu: detected ip block number 5 <dce_v1_0_0> (dm)
[ 5.521786] amdgpu 0000:01:00.0: amdgpu: detected ip block number 6 <gfx_v11_0_0> (gfx_v11_0)
[ 5.521788] amdgpu 0000:01:00.0: amdgpu: detected ip block number 7 <sdma_v6_0_0> (sdma_v6_0)
[ 5.521791] amdgpu 0000:01:00.0: amdgpu: detected ip block number 8 <vcn_v4_0_0> (vcn_v4_0)
[ 5.521794] amdgpu 0000:01:00.0: amdgpu: detected ip block number 9 <jpeg_v4_0_0> (jpeg_v4_0)
[ 5.521797] amdgpu 0000:01:00.0: amdgpu: detected ip block number 10 <mes_v11_0_0> (mes_v11_0)
[ 5.521821] amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from VFCT
[ 5.521825] amdgpu: ATOM BIOS: 113-D70401-00
[ 5.522148] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/psp_13_0_0_sos.bin failed with error -2
[ 5.522153] amdgpu 0000:01:00.0: amdgpu: early_init of IP block failed -19
[ 5.522188] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/smu_13_0_0.bin failed with error -2
[ 5.522191] amdgpu 0000:01:00.0: amdgpu: early_init of IP block failed -19
[ 5.522221] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/dcn_3_2_0_dmcub.bin failed with error -2
[ 5.522224] amdgpu 0000:01:00.0: amdgpu: early_init of IP block failed -19
[ 5.522254] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/gc_11_0_0_pfp.bin failed with error -2
[ 5.522257] amdgpu 0000:01:00.0: amdgpu: early_init of IP block <gfx_v11_0> failed -19
[ 5.522290] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/sdma_6_0_0.bin failed with error -2
[ 5.522294] amdgpu 0000:01:00.0: amdgpu: early_init of IP block <sdma_v6_0> failed -19
[ 5.522324] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/vcn_4_0_0.bin failed with error -2
[ 5.522327] amdgpu 0000:01:00.0: amdgpu: early_init of IP block <vcn_v4_0> failed -19
[ 5.522357] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/gc_11_0_0_mes_2.bin failed with error -2
[ 5.522360] amdgpu 0000:01:00.0: amdgpu: try to fall back to gc_11_0_0_mes.bin
[ 5.522385] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/gc_11_0_0_mes.bin failed with error -2
[ 5.522387] amdgpu 0000:01:00.0: amdgpu: early_init of IP block <mes_v11_0> failed -19
[ 5.522394] amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
[ 5.522400] amdgpu 0000:01:00.0: amdgpu: amdgpu: finishing device.

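As you can see, every firmware request (amdgpu/psp_13_0_0_sos.bin, amdgpu/smu_13_0_0.bin, and so on) fails with error -2, which is ENOENT: the kernel cannot find the file in its firmware search path. The -19 (ENODEV) early_init failures and the final "Fatal error during GPU init" follow from those missing files.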
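To make the list of missing files easier to compare against what is actually on disk, here is a minimal Python sketch that pulls the failed firmware names and error codes out of dmesg text. The function name missing_firmware and the regular expression are my own, written against the log format shown above; other kernel versions may word the message differently.

```python
import errno
import re

# Matches kernel lines such as:
#   "... Direct firmware load for amdgpu/psp_13_0_0_sos.bin failed with error -2"
# (pattern is an assumption based on the log format pasted in this post)
FIRMWARE_FAILURE = re.compile(
    r"Direct firmware load for (?P<path>\S+) failed with error (?P<code>-?\d+)"
)


def missing_firmware(log_text):
    """Return (path, code, errno_name) for each failed firmware request, in log order."""
    failures = []
    for line in log_text.splitlines():
        match = FIRMWARE_FAILURE.search(line)
        if match is None:
            continue  # not a firmware-load failure line
        code = int(match.group("code"))
        # The kernel reports negative errno values; look up the symbolic name.
        name = errno.errorcode.get(-code, "UNKNOWN")
        failures.append((match.group("path"), code, name))
    return failures
```

One way to use it is to save the output of sudo dmesg to a file, read that file into a string, and pass the string to missing_firmware; each returned tuple names one file the kernel asked for and could not open.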

I have already installed the linux-firmware package, so I suspect the way NixOS assembles the system is keeping the kernel from locating or loading the firmware.
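To check whether the files are visible where the kernel looks, a rough Python sketch like the one below could be run inside the guest. It reads the custom search path from /sys/module/firmware_class/parameters/path (a standard kernel parameter), falls back to /lib/firmware, and also tries the .zst and .xz suffixes the kernel accepts when compressed firmware support is enabled. The helper names search_dirs and locate are hypothetical, and I am assuming, not asserting, that NixOS points that parameter at its own firmware directory.

```python
from pathlib import Path

# Suffixes the kernel may try when compressed firmware loading is enabled.
SUFFIXES = ("", ".zst", ".xz")

# Kernel parameter holding an optional custom firmware search path.
PARAM_FILE = Path("/sys/module/firmware_class/parameters/path")


def search_dirs(param_file=PARAM_FILE):
    """Return the directories to check: the custom path (if set), then /lib/firmware."""
    dirs = []
    try:
        custom = param_file.read_text().strip()
    except OSError:
        custom = ""  # parameter not readable, e.g. when run outside Linux
    if custom:
        dirs.append(Path(custom))
    dirs.append(Path("/lib/firmware"))
    return dirs


def locate(name, dirs):
    """Return the first existing file for a firmware name, or None if absent."""
    for directory in dirs:
        for suffix in SUFFIXES:
            candidate = directory / (name + suffix)
            if candidate.is_file():
                return candidate
    return None
```

Calling locate("amdgpu/psp_13_0_0_sos.bin", search_dirs()) returning None would suggest the firmware never made it into the directory the kernel searches, whatever packages are installed. This sketch only checks the running system; if the amdgpu module is loaded from the initrd, the files would have to be present there as well, which it does not inspect.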

Any suggestions or insights would be greatly appreciated.

Supplementary Information:

  • Kernel: 6.18.16
  • Nix channel: unstable