Possibly graphical problems with upgrading from 24.11 to 25.05

So far I have been using NixOS 24.11 without any issues. I wanted to upgrade to 25.05 but encountered critical issues.

After using the system for a seemingly random amount of time (usually ~1-5 minutes), my whole desktop environment (and most likely the whole system) freezes permanently, except for the cursor, which I can still move.

I'm using Plasma 5 with X11, on an AMD Radeon GPU (RX 6800) with the “amdgpu” driver.

To upgrade I did:

sudo nix-channel --add https://nixos.org/channels/nixos-25.05 nixos
sudo nix-channel --update
# I edited my system flake so that nixpkgs points to 25.05
sudo nix flake update
sudo nixos-rebuild switch --flake /etc/nixos/#myNixos

Nothing else in my configuration changed from 24.11 to 25.05.

What I tried to fix the issue:

  • change the GPU driver to “modesetting”
  • try different desktop environments (Xfce, Cinnamon); still, I want to keep using Plasma
  • disable SDDM

I suspect the problem may stem from the Linux kernel change.
The one I currently use, which works without issues, is 6.6.75.
I also tried 6.6.92 and the 25.05 default, 6.12.31;
neither worked as 6.6.75 did.
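
Since `nixos-rebuild switch` does not replace the running kernel until the next reboot, it is worth confirming which kernel is actually booted before comparing versions. A minimal sketch, assuming 6.6.75 is the known-good release as stated above; `known_good` and `is_known_good_kernel` are hypothetical names, not part of any NixOS tooling:

```shell
# Assumption: 6.6.75 is the last kernel that worked, as reported in this post.
known_good="6.6.75"

# Hypothetical helper: succeeds if the given kernel release matches the known-good one.
is_known_good_kernel() {
  # Strip anything after the first "-", e.g. "6.6.75-extra" -> "6.6.75".
  release="${1%%-*}"
  [ "$release" = "$known_good" ]
}

# Compare the kernel that is running right now against the known-good release.
if is_known_good_kernel "$(uname -r)"; then
  echo "running known-good kernel $known_good"
else
  echo "running $(uname -r), not the known-good $known_good"
fi
```

Running this after each reboot makes it clear which kernel a given freeze (or lack of one) belongs to.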

I am very lost and would appreciate help. I still have not wrapped my head around the whole of NixOS, and there is a real possibility that I have a big misunderstanding. I did search the Discourse and Reddit to see if anyone had similar problems, but anything even close happened specifically with NVIDIA drivers, so I did not think it applied to my situation.

I suspect my whole system freezes, because if only the DE froze I would still be able to do things like reboot my system
(I tried CTRL + ALT + T and then typing reboot; it did not seem to work).

Again, thank you for any interest in this problem of mine.

Does the error also occur with Plasma version 6?

I’ve had similar issues (keyboard being basically gone, mouse still working) on X11 when I restarted dbus.service, since that broke systemd-logind and by extension libinput or something? I don’t recall the details (I just remember being pissed at systemd).

Can you check the logs and see if something happened to dbus?

# boot -1 to get the previous boot, so if it freezes and you restart your system that'd be the right one
journalctl --boot -1 --unit dbus.service

The DE freezing when dbus dies would also kinda check out (sadly).

(this is an absolutely out of the blue guess, no guarantees)

Unfortunately yes, this time I did not even get the time to finish typing in the password before it froze.

I checked the logs; the only messages possibly out of the ordinary are about unknown usernames in the message bus configuration file: “nm-openconnect” and “pulse”.

This is a kernel bug; the amdgpu drivers have been borked for a while.

At least most likely, can you share the complete logs of the previous boot (i.e. skip specifying the unit for journalctl)?

So yep, this is the solution. I’m surprised you’re seeing issues with 6.6.92, though.

I checked the journal; there are indeed many errors and warnings at the very last moment. Sample:

amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data

"kwin_wayland_drm: Pageflip timed out! This is a bug in the amdgpu kernel driver\nkwin_wayland_drm: Please report this at https://gitlab.freedesktop.org/drm/amd/-/issues

kwin_wayland_drm: Please report this at https://gitlab.freedesktop.org/drm/amd/-/issues\nkwin_wayland_drm: With the output of 'sudo dmesg' and 'journalctl --user-unit plasma-kwin_wayland --boot 0'\n

amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000

amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff!

I don’t think the rest is relevant, but if you still want it I’m willing to publish the whole journal.
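
For anyone who wants to pull only lines like the ones above out of a full journal, a rough sketch follows. `filter_gpu_lines` is a hypothetical helper name, and the pattern is just a guess based on the messages quoted in this post, so it may miss other relevant lines:

```shell
# Hypothetical helper: keep only lines that mention the amdgpu driver
# or one of KWin's DRM backends (e.g. kwin_wayland_drm).
filter_gpu_lines() {
  grep -E 'amdgpu|kwin_[a-z0-9]+_drm'
}

# Demo on two lines: the first is copied from this thread,
# the second is a made-up unrelated line that should be dropped.
printf '%s\n' \
  'amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff!' \
  'systemd[1]: Started Network Manager.' \
  | filter_gpu_lines
```

On a real system the input would come from the previous boot's journal, for example `journalctl --boot -1 --no-pager | filter_gpu_lines`.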

Yep, fair enough. See here: Making sure you're not a bot!

Next set of kernel releases should fix it, if you’re feeling adventurous you can also apply this patch downstream: Making sure you're not a bot!

Still surprised it made it into an LTS kernel; I’ve been seeing these issues too, but only when on gmeet calls, my WM doesn’t seem to tax the GPU as much as plasma, so probably just hits the issue less often.