Crashing on AMD iGPU (ring gfx_0.1.0 timeout)

I have a hybrid graphics laptop (Lenovo 16ARX8) that works perfectly in dedicated graphics mode. When I switch to hybrid graphics, my desktop crashes after a few minutes with this error:

niri[2643]: amdgpu: The CS has cancelled because the context is lost. This context is innocent.
kernel: amdgpu 0000:05:00.0: amdgpu: [drm] *ERROR* Failed to initialize parser -125!

The similar issues I can find online all involve a ring gfx_* timeout, but none of their solutions has helped so far.

Journal
kernel: amdgpu 0000:05:00.0: amdgpu: Dumping IP State
kernel: amdgpu 0000:05:00.0: amdgpu: Dumping IP State Completed
kernel: amdgpu 0000:05:00.0: amdgpu: [drm] AMDGPU device coredump file has been created
kernel: amdgpu 0000:05:00.0: amdgpu: [drm] Check your /sys/class/drm/card1/device/devcoredump/data
kernel: amdgpu 0000:05:00.0: amdgpu: ring gfx_0.1.0 timeout, signaled seq=19887, emitted seq=19888
kernel: amdgpu 0000:05:00.0: amdgpu:  Process niri pid 2643 thread niri:cs0 pid 2652
kernel: amdgpu 0000:05:00.0: amdgpu: Starting gfx_0.1.0 ring reset
kernel: amdgpu 0000:05:00.0: amdgpu: Ring gfx_0.1.0 reset failed
kernel: amdgpu 0000:05:00.0: amdgpu: GPU reset begin!. Source:  1
kernel: amdgpu 0000:05:00.0: amdgpu: MODE2 reset
kernel: amdgpu 0000:05:00.0: amdgpu: GPU reset succeeded, trying to resume
kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F47FC00000).
kernel: amdgpu 0000:05:00.0: amdgpu: PSP is resuming...
kernel: amdgpu 0000:05:00.0: amdgpu: reserve 0xa00000 from 0xf47e000000 for PSP TMR
kernel: amdgpu 0000:05:00.0: amdgpu: RAS: optional ras ta ucode is not available
kernel: amdgpu 0000:05:00.0: amdgpu: RAP: optional rap ta ucode is not available
kernel: amdgpu 0000:05:00.0: amdgpu: SECUREDISPLAY: optional securedisplay ta ucode is not available
kernel: amdgpu 0000:05:00.0: amdgpu: SMU is resuming...
kernel: amdgpu 0000:05:00.0: amdgpu: SMU is resumed successfully!
kernel: amdgpu 0000:05:00.0: amdgpu: kiq ring mec 2 pipe 1 q 0
kernel: amdgpu 0000:05:00.0: amdgpu: [drm] DMUB hardware initialized: version=0x05002C00
kernel: amdgpu 0000:05:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
kernel: amdgpu 0000:05:00.0: amdgpu: ring gfx_0.1.0 uses VM inv eng 1 on hub 0
kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 4 on hub 0
kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 5 on hub 0
kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
kernel: amdgpu 0000:05:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 12 on hub 0
kernel: amdgpu 0000:05:00.0: amdgpu: ring sdma0 uses VM inv eng 13 on hub 0
kernel: amdgpu 0000:05:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
kernel: amdgpu 0000:05:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
kernel: amdgpu 0000:05:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
kernel: amdgpu 0000:05:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
kernel: amdgpu 0000:05:00.0: amdgpu: GPU reset(1) succeeded!
kernel: amdgpu 0000:05:00.0: [drm] device wedged, but recovered through reset
kernel: amdgpu 0000:05:00.0: amdgpu: [drm] *ERROR* Failed to initialize parser -125!

My config for hybrid graphics is the following:

# 16ARX8.nix
{ config, pkgs, lib, ... }:
{
  services.xserver.dpi = 189;

  boot.kernelModules = ["amdgpu"];
  services.xserver.videoDrivers = [
    "amdgpu"
    "nvidia"
  ];

  hardware = {
    graphics.enable = true;

    amdgpu.initrd.enable = false;

    nvidia = {
      package = config.boot.kernelPackages.nvidiaPackages.latest;
      open = false;
      modesetting.enable = true;
      nvidiaSettings = true;

      powerManagement.enable = true;
      powerManagement.finegrained = true;

      prime = {
        amdgpuBusId = "PCI:5:0:0";
        nvidiaBusId = "PCI:1:0:0"; 

        offload = {
          enable = true;
          enableOffloadCmd = true;
        };
      };
    };
  };
}
  • Sometimes it crashes after a few seconds, sometimes after minutes. Nothing I do seems to affect how long it takes to crash
  • Sometimes it crashes GDM before I can even log in
  • I have tried using nixos-hardware as a reference but I can’t find any fix there
  • I have tried both 6.12 and 6.18 kernel packages
  • Setting amdgpu.ppfeaturemask=0xfff73fff, or any other feature mask, does not help
  • I have no overclocking and have tried resetting and disabling overclocking in BIOS
  • I have updated the BIOS
  • Enabling or disabling TLP does not help
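
For reference, a minimal sketch of how the feature mask from the list above can be passed on NixOS. It assumes the parameter goes on the kernel command line through boot.kernelParams; the exact method used in the attempts above is not shown in this post, and the setting did not fix the crash.

```nix
{ ... }:
{
  # Assumption: the amdgpu module parameter is set via the kernel command line.
  # The mask value is the one quoted in the list above; it did not help here.
  boot.kernelParams = [ "amdgpu.ppfeaturemask=0xfff73fff" ];
}
```
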
~ $ nix-info -m
 - system: `"x86_64-linux"`
 - host os: `Linux 6.18.2, NixOS, 26.05 (Yarara), 26.05.20251221.a653104`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.31.2+1`
 - channels(root): `"nixos-25.11"`
 - nixpkgs: `/nix/store/1ny7brxnqbx6xilj7mjdlinzpb5a1s3i-source`

I had no issues when I was using Arch a few months ago, so I know it worked at one point.

Really not sure what to try next.

It’s caused by a linux-firmware regression, tracked in NixOS/nixpkgs issue #466945: "linux-firmware: amdgpu regression causing freezes, fix available upstream".


Thank you for the quick solution!

Using the older linux-firmware package specified in this comment does in fact solve my issue until the fix is merged in NixOS/nixpkgs pull request #472939: "[Backport release-25.11] linux-firmware: 20251125 -> unstable-2025-12-18" by nixpkgs-ci[bot].
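
For anyone landing here, this is a rough sketch of one way to pin an older linux-firmware with an overlay. It is not necessarily the method from the linked comment. The file name and the knownGoodRev value are hypothetical placeholders: substitute a nixpkgs commit that still ships a working firmware version.

```nix
# firmware-pin.nix (hypothetical file name)
{ ... }:
let
  # Placeholder, not a real value: a nixpkgs commit from before the regression.
  knownGoodRev = "<known-good-nixpkgs-rev>";

  # Import that older nixpkgs only to borrow its linux-firmware package.
  oldPkgs = import (builtins.fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/${knownGoodRev}.tar.gz";
    # Add sha256 = "..."; here if you evaluate in pure mode.
  }) { system = "x86_64-linux"; };
in
{
  # Replace linux-firmware everywhere in the system closure with the older build.
  nixpkgs.overlays = [
    (final: prev: { linux-firmware = oldPkgs.linux-firmware; })
  ];
}
```

Import the file from your configuration and rebuild, then remove it once the backport has reached your channel.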
