Hi All,
After updating from NixOS 25.05.20250612.fd48718 (built 2025-06-12) to 25.05.20250613.5f4f306 (built 2025-06-15), the amdgpu driver no longer brings up my displays and both monitors stay powered off. I made no configuration changes between the two builds. Everything works fine if I boot the earlier generation, and every update since then shows the same failure.
Kernel/Hardware summary:
Kernel: Linux 6.12.33
Display (AUS28CA): 3840x2160 @ 60 Hz (as 2560x1440) in 28" [External]
Display (VZ27A): 2560x1440 @ 60 Hz in 27" [External]
CPU: AMD Ryzen 9 7950X (32) @ 5.88 GHz
GPU 1: AMD Raphael [Integrated]
GPU 2: NVIDIA GeForce RTX 4060 Ti [Discrete]
(I use the NVIDIA GPU for LLMs)
amdgpu-related entries from the system log:
Jun 17 19:41:12 akgd kernel: [drm] amdgpu kernel modesetting enabled.
Jun 17 19:41:12 akgd kernel: amdgpu: vga_switcheroo: detected switching method \_SB_.PCI0.GP17.VGA_.ATPX handle
Jun 17 19:41:12 akgd kernel: amdgpu: ATPX version 1, functions 0x00000000
Jun 17 19:41:12 akgd kernel: amdgpu: Virtual CRAT table created for CPU
Jun 17 19:41:12 akgd kernel: amdgpu: Topology: Add CPU node
Jun 17 19:41:12 akgd kernel: amdgpu 0000:72:00.0: enabling device (0006 -> 0007)
Jun 17 19:41:12 akgd kernel: amdgpu 0000:72:00.0: amdgpu: Fetched VBIOS from VFCT
Jun 17 19:41:12 akgd kernel: amdgpu: ATOM BIOS: 102-RAPHAEL-008
Jun 17 19:41:12 akgd kernel: amdgpu 0000:72:00.0: vgaarb: deactivate vga console
Jun 17 19:41:12 akgd kernel: amdgpu 0000:72:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
Jun 17 19:41:12 akgd kernel: amdgpu 0000:72:00.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
Jun 17 19:41:12 akgd kernel: amdgpu 0000:72:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
Jun 17 19:41:12 akgd kernel: [drm] amdgpu: 512M of VRAM memory ready
Jun 17 19:41:12 akgd kernel: [drm] amdgpu: 31717M of GTT memory ready.
Jun 17 19:41:12 akgd kernel: amdgpu 0000:72:00.0: amdgpu: reserve 0xa00000 from 0xf41e000000 for PSP TMR
Jun 17 19:41:12 akgd kernel: amdgpu 0000:72:00.0: amdgpu: failed to load ucode DMCUB(0x3D)
Jun 17 19:41:12 akgd kernel: amdgpu 0000:72:00.0: amdgpu: psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0008)
Jun 17 19:41:12 akgd kernel: amdgpu 0000:72:00.0: amdgpu: RAS: optional ras ta ucode is not available
Jun 17 19:41:12 akgd kernel: amdgpu 0000:72:00.0: amdgpu: RAP: optional rap ta ucode is not available
Jun 17 19:41:12 akgd kernel: amdgpu 0000:72:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Jun 17 19:41:12 akgd kernel: amdgpu 0000:72:00.0: amdgpu: SMU is initialized successfully!
Jun 17 19:41:13 akgd kernel: amdgpu 0000:72:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Jun 17 19:41:13 akgd kernel: amdgpu 0000:72:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Jun 17 19:41:13 akgd kernel: amdgpu 0000:72:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Up to the last three lines, this log is identical to the log from a successful boot. Only the three DMCUB error lines at the end are new.
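For anyone who wants to repeat that comparison, here is a minimal sketch of how the two logs could be diffed. It assumes each boot's kernel log has been saved as text in the same format as the excerpt above (for example from `journalctl -k -b <offset>`). The function names `normalize` and `only_in_bad` are made up for illustration and are not part of any tool.

```python
import re

# Matches the journal prefix seen in the excerpt above,
# e.g. "Jun 17 19:41:12 akgd kernel: ".
PREFIX = re.compile(r"^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s+kernel:\s+")


def normalize(lines):
    """Strip the timestamp/host prefix and keep only amdgpu-related messages."""
    messages = []
    for line in lines:
        msg = PREFIX.sub("", line.strip())
        if "amdgpu" in msg:
            messages.append(msg)
    return messages


def only_in_bad(good_lines, bad_lines):
    """Return messages from the failing boot that never occur in the good boot.

    Order is preserved and repeated messages are reported once.
    """
    good = set(normalize(good_lines))
    seen = set()
    result = []
    for msg in normalize(bad_lines):
        if msg not in good and msg not in seen:
            seen.add(msg)
            result.append(msg)
    return result
```

Called with the lines of a good boot and a failing boot, `only_in_bad` should leave just the messages unique to the failing boot. Timestamps are ignored, so the same message logged at a different time does not count as a difference.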
Relevant kernel configuration (parameters, patches and modules):
nix-repl> nixosConfigurations.akgd.config.boot.kernelParams
[
  "nohibernate"
  "loglevel=4"
  "lsm=landlock,yama,bpf"
  "nvidia-drm.modeset=1"
  "nvidia-drm.fbdev=1"
  "nvidia.NVreg_OpenRmEnableUnsupportedGpus=1"
]
nix-repl> nixosConfigurations.akgd.config.boot.kernelPatches
[ ]
nix-repl> nixosConfigurations.akgd.config.boot.kernelModules
[
  "kvm-amd"
  "bridge"
  "macvlan"
  "tap"
  "tun"
  "zfs"
  "loop"
  "atkbd"
  "ctr"
  "nvidia_uvm"
  "nvidia"
  "nvidia_modeset"
  "nvidia_drm"
  "i2c-dev"
]
Any suggestions?
Thanks.