AMD 9070 XT Linux experiences - early June 2025

I’m not necessarily expecting much help by posting this. It’s more of a “state of the union” style post to document my own experiences with the AMD 9070 XT GPU, as I haven’t seen many NixOS 9070 users so far.

The TL;DR is “expect instability”. I’ll skip the really long rant about having been an Nvidia user on Linux since the TNT2 days, and about the Nvidia driver stack always having been far more stable than either the former fglrx or the current amdgpu stack.

However, given the general insanity of the 12VHPWR connectors on the 50 series GeForce cards, I simply can’t justify putting something like that in my house. Having said that, part of the reason I’m posting this here now is because I ran into a somewhat similarly concerning issue with the 9070 also.

For the NixOS specific side of things, I’m not doing anything odd or crazy on this system:

boot = {
  initrd.kernelModules = [ "amdgpu" ];
  kernelPackages = pkgs.linuxPackages_6_14;
  kernelParams = [
    "amdgpu.ppfeaturemask=0xfffd3fff"
    "split_lock_detect=off"
  ];
};
hardware.graphics = {
  enable = true;
};

I think that’s about it other than using i3 via services.displayManager.defaultSession = "xsession" and having Steam installed via programs.steam.enable = true.

The good news is that things are great, at least for a day or so. Games in Steam run beautifully and all my various hardware accelerated video encoding/decoding needs work great. I initially was running this 9070 XT in a system built around an AMD 5800X, but in part due to the instability, I moved that system back to my older 3070 Ti and built a dedicated 9950X3D box. So even without the Jellyfin transcoding load from that older system (my NAS/Jellyfin box), the experience with the newer system is the same. Everything works for about a day, and then eventually, the system crashes.

The crash always seems to start with my mouse input starting to stagger for a few seconds like the system is under extreme load, followed eventually by a full graphical lockup or even the screen going blank and eventually the video card disabling the display port connection to the monitor entirely (as the monitor ends up powering off). This is partly why I’ve settled on the current kernel CLI parameters above as without those, the system seems to crash even faster. Additionally, I’ve had to disable the hardware video acceleration features in Steam, otherwise the UI likes to freeze and stop updating itself when I flip away to another i3 desktop and back to Steam. I can force the UI to start updating again if I open another window on the same desktop causing the Steam window itself to resize. But this is required every single time I flip away and back. Admittedly, this might be specific to i3 (I am using picom for whatever that’s worth).

I should also mention I’m using ZFS everywhere for everything. This has caused me zero problems with my Nvidia cards in the past. But I obviously can’t assume that necessarily for the AMD driver stack.

Now, to be fair, I saw very similar types of problems with a 6800 XT previously which ended up going to my partner’s Windows machine as a result. And even my Lenovo laptop with a 6900 HS seems to like to lock up within roughly the same time frame usually. So it seems like the amdgpu driver stack is questionable all around, regardless of hardware generation to some degree. About the only amdgpu based system I don’t have this particular experience with is my Steam Deck, although it’s usually not running for more than a few hours any given day before I shut it down again.

The concerning part in all of this was that yesterday this happened on my 9950X3D system and after the display blanked and powered down, some of my fans spun up to ludicrous levels in my system like something was trying to massively overheat. I ended up holding down the power button to kill power entirely as there was so much noise and vibration, it sounded like something was about to damage itself. Since I didn’t feel like I had time to pull the case apart, I’m not positive it was the GPU fans trying to rapidly disassemble themselves. It could have just as easily been the CPU or case fans if the CPU itself was the one trying to eat itself alive.

Anyway, I know 6.15 has even more amdgpu fixes, which I will try just as soon as ZFS officially supports it. But as it stands on mesa-25.1.1 and kernel 6.14.9 (as of yesterday when this happened anyway; on 6.14.10 now), this card isn’t what I’d call stable by any means. It will probably be a frustrating experience for folks expecting things to “just work”.

I’d love to hear from other NixOS users, especially if you’re having a nearly 100% stable experience with this GPU to see how things differ potentially. But I’m also assuming this is all just par for the course for the amdgpu driver stack to some extent since I’ve never seen it be a paragon of stability in any of its various incarnations. I am hoping it at least stabilizes to the point where I don’t have to assume my system will crash about once a day or so if I avoid shutting it down.

It might have been the GPU fan. I can get pretty close to what I think I was hearing if I run Quake II RTX on this system:

amdgpu_top v0.10.5
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│GPU Name                                | PCI Bus        |    VRAM Usage    |                                                                         │
│SCLK    MCLK    VDDGFX  Power           | GFX% UMC%Media%|     GTT Usage    |                                                                         │
│Temp    Fan     ECC_UnCorr. Throttle_Status                                 |                                                                         │
│------------------------------------------------------------------------------------------------------------------------------------------------------│
│#0  [AMD Radeon RX 9070 XT    ](gfx1201)| 0000:03:00.0   |  5012/ 16304 MiB |                                                                         │
│2954MHz 1258MHz 1006mV  329/330W        | 100%  33%   0% |   264/ 30943 MiB |                                                                         │
│  66C      0RPM [      N/A] [PPT0]                                          |                                                                         │
│------------------------------------------------------------------------------------------------------------------------------------------------------│
│#1  [AMD Radeon Graphics      ](gfx1036)| 0000:17:00.0   |    24/  2048 MiB |                                                                         │
│ 600MHz 3000MHz 1035mV   52/___W        |   0% ___%   0% |    15/ 30943 MiB |                                                                         │
│  49C   ____RPM [      N/A] []                                              |                                                                         │
│ Tctl: 53C CPU freq (100MHz): [30,30,24,30,28,30,30,45,44,44,44,44,22,30,30,44]                                                                       │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
┌┤ Processes ├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│┌┤ #0  AMD Radeon RX 9070 XT ├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐│
││ Name            |  PID  |KFD| VRAM | GTT  |CPU |GFX |COMP|DMA |VCNU|                                                                               ││
││ q2rtx           | 798844|   | 3627M|  121M|  4%|100%|  3%|  0%|  0%|                                                                               ││
││ .firefox-wrappe | 712750|   |  273M|   22M|  0%|  0%|  0%|  0%|  0%|                                                                               ││
││ .ghostty-wrappe | 727555|   |  128M|   25M|  3%|  0%|  0%|  0%|  0%|                                                                               ││
││ RDD Process     | 712866|   |  118M|    8M|  0%|  0%|  0%|  0%|  0%|                                                                               ││
││ .pwvucontrol-wr | 713473|   |   78M|   18M|  1%|  0%|  0%|  0%|  0%|                                                                               ││
││ steamwebhelper  | 713873|   |   65M|   58M|  2%|  0%|  0%|  0%|  0%|                                                                               ││
││ steam           | 713632|   |   20M|    6M|  0%|  0%|  0%|  0%|  0%|                                                                               ││
││ steamwebhelper  | 713941|   |   19M|    8M|  0%|  0%|  0%|  0%|  0%|                                                                               ││
││ electron        | 713343|   |   18M|    6M|  0%|  0%|  0%|  0%|  0%|                                                                               ││
││ amdgpu_top      | 798999|   |    0M|    2M|  0%|  0%|  0%|  0%|  0%|                                                                               ││
│└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘│
│┌┤ #1  AMD Radeon Graphics ├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐│
││ Name            |  PID  |KFD| VRAM | GTT  |CPU |GFX |COMP|DMA |DEC |ENC |                                                                          ││
││ steamwebhelper  | 713941|   |    0M|    2M|  0%|  0%|  0%|  0%|  0%|  0%|                                                                          ││
││ q2rtx           | 798844|   |    0M|    2M|  4%|  0%|  0%|  0%|  0%|  0%|                                                                          ││
││ amdgpu_top      | 798999|   |    0M|    2M|  0%|  0%|  0%|  0%|  0%|  0%|                                                                          ││
│└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘│
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Looking at the slightly prettier version of the same tool, I also see the junction and memory temperature getting up to 92°C with the edge temperature at the same 66°C shown in the more textual version above.

This version of Quake seems to run either very inefficiently or in a full-on, all-stops-pulled-out mode, as some much more modern titles don’t result in quite the same maximum utilization.

Anyway, thought I’d follow up on the fan noise from before. It might be the same thing I was hearing when it crashed and went to a blank display yesterday.

As someone who’s been using an AMD card which regularly hit temperature limits and triggered an emergency power-off under load: try SSH-ing into the machine from elsewhere and running dmesg -ew or something. At least in my case the emergency shutdown triggered in a way that didn’t let journald write to disk, so the logs just stopped, but dmesg and sshd were running 'til the end, so I could see the message there.
For me too the junction temperature was very high, though in my case around 100°C, and shutdown probably triggering at 110 to 120°C, which was already worrying.
This won’t solve your issue of course, but it will help you pin it down to “driver issue” or “temperature issue” at least, because from my experience the drivers were actually pretty stable.
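
The remote dmesg suggestion above presupposes that an SSH server is already running on the affected machine before it hangs. A minimal sketch of that prerequisite on NixOS, assuming OpenSSH is not enabled yet and leaving all hardening (keys, firewall) at whatever the rest of your configuration already sets:

```nix
{ ... }:
{
  # Assumption: sshd is not already enabled on this machine.
  # With this set, another machine can log in and follow the kernel log
  # while the graphical session is frozen.
  services.openssh.enable = true;
}
```

From another machine, something like `ssh <host> dmesg -ew` then keeps streaming kernel messages until the box goes down. Depending on the `kernel.dmesg_restrict` sysctl, reading the kernel log may need root.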

I’m now running my former NVidia card again, but I’ve settled on eventually moving to Intel with my next card because apparently NVidia and AMD are equally bad at doing what they do, or at least fail to offer reasonable pricing. YMMV of course.

I’m definitely not hitting any thermal limits under normal crash conditions. I’ve seen crashes both while simply using the media codec engines (browsing video in Firefox) and while doing fairly run of the mill gaming (e.g. Factorio) where the GPU fans aren’t even audible at the time the crash starts at least.

But I also have been able to SSH in at various points during some of these. Sometimes a reboot/poweroff will finish successfully and sometimes the system ends up hanging at that point. It’s not bad to the point of the fglrx days where the monitor going to sleep would crash the kernel. But it’s nowhere near as stable as the Nvidia driver stack (and never has been).

C’est la vie! Hopefully with Valve so heavily involved in the amdgpu driver stack also, things will eventually stabilize to a point where crashing isn’t simply a given. But it’s telling that you have also abandoned your AMD card for the time being. I worry every time I see a discussion where folks are pushing new users to using AMD GPUs because they are supposedly categorically better in every way. While the licensing (and therefore source and resulting open development) is hands down better on the AMD side, the usability and stability have always been lacking in my experience.

there’s a recently fixed bug upstream that may be causing this. see this previous discussion for a workaround that resolved the issue for me. the fix has been merged into AMD’s drm repo and the patch has been submitted to the mainline but I don’t think there’s been a release yet with the fix.

so check dmesg and see if this is what you’re experiencing, because if so, it’s a relatively simple workaround for the time being. bit annoying that this made it into a release but at least the fix isn’t involved.

the fix I mentioned was released in 6.15.2 and 6.12.33.
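
For anyone who wants their configuration to flag a kernel that predates those two releases, here is a rough sketch. It assumes the fix exists only in 6.15.2 and later, plus the 6.12 series from 6.12.33 onward, which is all this thread states. It knows nothing about backports to other series or vendor patches, and `hasFix` is just an illustrative name:

```nix
{ config, lib, ... }:
let
  # Version string of whichever kernel the system is configured to boot.
  v = config.boot.kernelPackages.kernel.version;

  # True for 6.15.2 and newer, or for 6.12.x releases from 6.12.33 onward.
  # 6.13.x and 6.14.x are treated as lacking the fix (an assumption).
  hasFix =
    lib.versionAtLeast v "6.15.2"
    || (lib.versionAtLeast v "6.12.33" && lib.versionOlder v "6.13");
in
{
  # Emits an evaluation-time warning rather than failing the rebuild.
  warnings = lib.optional (!hasFix)
    "kernel ${v} predates the amdgpu fix released in 6.15.2 and 6.12.33";
}
```

It only prints a warning during `nixos-rebuild`, so it will not block a build if the version check turns out to be too strict.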

I’ll mark this as “solved”. I’d say it was really around 6.15.3 and kernel firmware 20250613 that things started to stabilize for me on the 9070 XT.

Having said that, I’ve also disabled the integrated GPU on my CPU via BIOS and switched entirely to Wayland on the machine in question, so it wasn’t entirely painless or without its issues.

I didn’t find a single display manager that properly supports PAM with the 2FA Google authentication library active. I’m simply firing up sway automatically via my .zshrc if I’m logging in via tty1 since the basic login getty handles that correctly. I realize I could still use lightdm (which also handles the PAM stuff correctly) and start a Wayland session, but I’m trying to keep it all Wayland as much as possible to avoid potential issues wherever possible.
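
For reference, roughly the same tty1 autostart can be expressed declaratively. This is a sketch of a variant, not my actual setup (which lives in a per-user .zshrc): it assumes zsh is the login shell and uses the system-wide `programs.zsh.loginShellInit` option instead.

```nix
{ ... }:
{
  programs.sway.enable = true;
  programs.zsh.enable = true;

  # Runs for zsh login shells. Start sway only on the first virtual
  # console, and only if no Wayland session is already running, so SSH
  # logins and other ttys still get a normal shell.
  programs.zsh.loginShellInit = ''
    if [ -z "$WAYLAND_DISPLAY" ] && [ "$(tty)" = "/dev/tty1" ]; then
      exec sway
    fi
  '';
}
```

Because getty and login handle authentication before the shell starts, the PAM 2FA prompt is unaffected by this approach.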

Ditto the 2FA problems with the screen lockers I’ve looked at. I’m just not doing anything with my screen anymore. Not having xscreensaver is potentially a bridge too far. Time will tell.

Anyway, I did want to say things have gotten better from a stability standpoint. I’m having fewer issues now than I did with a 6800 XT a few years ago (still well into what should have been very mature drivers at that point; and that card runs stably elsewhere), albeit with an entire graphical stack overhaul to boot.

I’m glad things seem to potentially be looking up for you. I also am upgrading from an RTX 3070 (though not Ti), use ZFS (though not for NixOS itself), have a Lenovo laptop (Legion 5 Pro) and have a partner that uses Windows. My Gigabyte 9070 XT Gaming OC (one of the few models that will fit in my case without the removal of a front fan) arrived yesterday.

I currently dual-boot NixOS and Windows 10. I’m only a basic user of NixOS, staying on the stable channel, declining to get into Flakes so far and using Devbox for dev environments.

I would like to try NixOS with the new GPU at some point, but I’m thinking about making this machine a Windows gaming-only one for now and using my laptop or another computer for NixOS/work. I wouldn’t know how to apply the workaround linked to above, though NixOS 25.05 seems to be using kernel 6.12.37 which I think means I shouldn’t need to. I suppose I could whack the card in and see what happens…

The wiki seems to suggest that all I would need is:

hardware.graphics = {
  enable = true;
  enable32Bit = true;
};

Do you think it would be worth trying in the stable channel, or should I go to unstable and use the following?

I don’t think running on anything much older than 6.15 with this card is going to be a very enjoyable experience.

I am not using any of the kernelParams on 6.15.6 currently and the system has been stable so far. But I also haven’t left my system on for more than a 24 hour stretch since my initial problem report where the card went a bit nuts. I’m a little hesitant to leave it unattended and have it potentially damage the card with excessive heat for an extended period of time.

As for which NixOS versions might be usable currently, it looks like only nixos-25.05 or the master branch directly are usable at the moment as nixos-unstable doesn’t appear to have the necessary bits:

% git checkout nixos-25.05 && jq -r '.. | .version? | select(. != null)' < pkgs/os-specific/linux/kernel/kernels-org.json | sort | grep ^6; grep '^  version' pkgs/by-name/li/linux-firmware/package.nix
Updating files: 100% (14017/14017), done.
Switched to branch 'nixos-25.05'
Your branch is up to date with 'origin/nixos-25.05'.
6.1.144
6.12.37
6.13.12
6.14.11
6.15.6
6.16-rc5
6.6.97
  version = "20250708";

% git checkout nixos-unstable && jq -r '.. | .version? | select(. != null)' < pkgs/os-specific/linux/kernel/kernels-org.json | sort | grep ^6; grep '^  version' pkgs/by-name/li/linux-firmware/package.nix
Updating files: 100% (25923/25923), done.
Switched to branch 'nixos-unstable'
Your branch is behind 'origin/nixos-unstable' by 93903 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
6.1.124
6.11.11
6.12.9
6.13-rc6
6.6.70
grep: pkgs/by-name/li/linux-firmware/package.nix: No such file or directory

% git checkout master && jq -r '.. | .version? | select(. != null)' < pkgs/os-specific/linux/kernel/kernels-org.json | sort | grep ^6; grep '^  version' pkgs/by-name/li/linux-firmware/package.nix
Updating files: 100% (31853/31853), done.
Switched to branch 'master'
Your branch is up to date with 'origin/master'.
6.1.145
6.12.38
6.15.6
6.16-rc6
6.6.98
  version = "20250708";

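In my checkout, nixos-unstable does not appear to define any new enough kernels yet, whereas the current stable and master branches do (though git reports my local nixos-unstable as 93903 commits behind origin, so that result may simply be stale). So using something like: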

boot = {
  initrd.kernelModules = [ "amdgpu" "zfs" ];
  kernelPackages = pkgs.linuxPackages_6_15;
  supportedFilesystems = [ "zfs" ];
};
graphics = {
  enable = true;
};

would probably be sufficient on the current nixos-25.05 stable branch.

Thanks very much. Things (general desktop use - no 3D stuff yet) seem to be okay so far. The fans on the card do not spin, as is also the case in Windows when the card isn’t doing much.

My config:

  hardware.graphics.enable = true;
  boot.initrd.kernelModules = [ "amdgpu" ]; # ZFS not used for NixOS itself
  boot.kernelPackages = pkgs.linuxPackages_6_15;

(I already had hardware.graphics.enable, and I couldn’t find a graphics.enable option when searching options.)

One issue I am having is this. It seems mouse polling rates above 1,000 Hz, and even 1,000 Hz if in high speed mode, result in inputs being lost: movement is extremely slow and/or the cursor will sometimes jump from one point to another in a straight line. I guess that’s a kernel thing rather than a GPU thing.

Sorry, yes, hardware.graphics.enable is the correct option.

As for the higher polling rate for your mouse, I can’t speak to that. My Logitech G600 isn’t exactly the cutting edge of input technology! :)

Note that adding your GPU driver to initrd generally doesn’t fix anything (except maybe kexec) but makes your initrds much larger (like 30M larger, which is considerable since each initrd takes up space on your ESP).

You’re probably better off setting boot.kernelPackages to pkgs.linuxPackages_latest rather than choosing a specific kernel.
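
Put together, the two suggestions above would look roughly like this. It is only a sketch, assuming nothing out-of-tree (ZFS, for example) restricts which kernel versions you can run:

```nix
{ pkgs, ... }:
{
  hardware.graphics.enable = true;

  # No boot.initrd.kernelModules entry for amdgpu: the module is loaded
  # from the root filesystem later in boot, which keeps each initrd on
  # the ESP smaller.

  # Follow the newest kernel packaged in the channel instead of pinning
  # a specific series that will eventually be dropped from nixpkgs.
  boot.kernelPackages = pkgs.linuxPackages_latest;
}
```

The trade-off is that every channel update may bring a new kernel series, which is exactly what you want for a new GPU but not necessarily for out-of-tree modules.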

Using pkgs.linuxPackages_latest isn’t really an option if you’re also using ZFS.

Future-proofing is important. History has taught me if I don’t do something like this, I will forget why I set it and even whether it was an upgrade or a downgrade at the time I set it:

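# Pin 6.15 only while the default kernel is older; once the default catches up,
# fall back to it and warn that this override can be removed.
boot.kernelPackages =
  if lib.versionOlder pkgs.linuxPackages.kernel.version pkgs.linuxPackages_6_15.kernel.version
  then pkgs.linuxPackages_6_15
  else builtins.warn "remove kernel version setting in ${__curPos.file} line ${toString __curPos.line}" pkgs.linuxPackages;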
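
To make the branch logic above concrete, here is a small standalone expression showing how `lib.versionOlder` compares version strings. The version numbers are just examples taken from this thread plus one hypothetical newer default, and `<nixpkgs>` is assumed to resolve on your machine:

```nix
let
  lib = (import <nixpkgs> { }).lib;
in
{
  # Default kernel older than the pin: the condition is true,
  # so the pinned 6.15 package set would be selected.
  defaultIsOlder = lib.versionOlder "6.12.37" "6.15.6";

  # Hypothetical newer default: the condition is false, so the default
  # kernel would be used and the reminder warning would fire.
  defaultIsNewer = lib.versionOlder "6.16.1" "6.15.6";
}
```

Evaluating this (for example with `nix-instantiate --eval --strict` on a file containing it) gives `true` for the first attribute and `false` for the second.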

Thanks, all. I’ve removed the amdgpu boot kernel modules line and added that use of the versionOlder function, and all seems well.

I’ve just upgraded to NixOS Stable 25.11 which meant using kernel 6.12.59 (the default for me with ZFS) as 6.17 doesn’t support ZFS, and things seem okay with my 9070XT after an hour or so of 2D desktop usage. I’ll keep an eye on things tomorrow during a longer session.

I still find minimising a window in Pantheon often requires me to switch workspace away and back to get rid of the image of the window, but I’m used to that.

Slightly off topic, but I’m using kernel 6.17.10 with ZFS as I type. But not using the 9070 XT GPU…

Oh, that’s good to know. I’ll stick with the default kernel for stable so that I don’t have to worry about specific kernels being out of support and preventing a rebuild again.