On my EliteBook 855 G7 laptop, the graphics drivers seem to have become less and less stable over the last couple of months. I think it was with the upgrade to 23.05 that I started seeing errors like this:
jul 19 11:18:10 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: Will use PSP to load VCN firmware
jul 19 11:18:10 nixhpix kernel: [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
jul 19 11:18:10 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
jul 19 11:18:10 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: RAP: optional rap ta ucode is not available
jul 19 11:18:10 nixhpix kernel: [drm] psp gfx command LOAD_TA(0x1) failed and response status is (0x7)
jul 19 11:18:10 nixhpix kernel: [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4)
jul 19 11:18:10 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: Secure display: Generic Failure.
jul 19 11:18:10 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: query securedisplay TA failed. ret 0x0
jul 19 11:18:10 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: SMU is initialized successfully!
And now, since I last ran nix flake update, brave-browser cannot render images properly and is unusable.
I have also had quite a few kernel crashes. I managed to find the logs for one of them:
jul 19 10:48:45 nixhpix systemd[1]: Finished Permit User Sessions.
jul 19 10:48:45 nixhpix systemd[1]: Starting X11 Server...
jul 19 10:48:45 nixhpix kernel: [drm] kiq ring mec 2 pipe 1 q 0
jul 19 10:48:45 nixhpix kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
jul 19 10:48:45 nixhpix kernel: [drm] JPEG decode initialized successfully.
jul 19 10:48:45 nixhpix kernel: kfd kfd: amdgpu: Allocated 3969056 bytes on gart
jul 19 10:48:45 nixhpix kernel: amdgpu: sdma_bitmap: 3
jul 19 10:48:45 nixhpix kernel: amdgpu: SRAT table not found
jul 19 10:48:45 nixhpix kernel: amdgpu: Virtual CRAT table created for GPU
jul 19 10:48:45 nixhpix kernel: amdgpu: Topology: Add dGPU node [0x1636:0x1002]
jul 19 10:48:45 nixhpix kernel: kfd kfd: amdgpu: added device 1002:1636
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: SE 1, SH per SE 1, CU per SH 8, active_cu_number 6
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx_low uses VM inv eng 1 on hub 0
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx_high uses VM inv eng 4 on hub 0
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 5 on hub 0
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 6 on hub 0
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 7 on hub 0
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 8 on hub 0
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 9 on hub 0
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 10 on hub 0
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 11 on hub 0
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 12 on hub 0
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 13 on hub 0
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1
jul 19 10:48:45 nixhpix kernel: amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1
jul 19 10:48:45 nixhpix kernel: [drm] Initialized amdgpu 3.52.0 20150101 for 0000:03:00.0 on minor 0
jul 19 10:48:45 nixhpix kernel: BUG: kernel NULL pointer dereference, address: 0000000000000012
jul 19 10:48:45 nixhpix kernel: #PF: supervisor read access in kernel mode
I added the nixos-hardware repository and tried the profile for the hp-elitebook-845g9 laptop, which should be similar in hardware to my machine. It pulled in an additional module that errors out on boot, so booting takes a bit longer, but the graphics issues in Brave are still there.
Additional info that might be important: I am running ZFS, so I have set boot.kernelPackages = mkForce config.boot.zfs.package.latestCompatibleLinuxPackages
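For context, a minimal sketch of how that kernel setting typically sits in a NixOS module. The module arguments and the lib. prefix are assumptions for illustration; the actual configuration may bring mkForce into scope differently (for example via with lib;).

```nix
# Sketch only: the surrounding module structure is assumed, not copied from the real config.
{ config, lib, ... }:
{
  # Pin the kernel to the newest release that the ZFS package declares support for.
  # mkForce overrides any kernelPackages value set by other modules
  # (for example, one imported from a nixos-hardware profile).
  boot.kernelPackages =
    lib.mkForce config.boot.zfs.package.latestCompatibleLinuxPackages;
}
```

Because of mkForce, a hardware profile that tries to select a different kernel would be silently overridden, which may matter when comparing behaviour with and without the nixos-hardware profile.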