AMD 9070 XT Linux experiences - early June 2025

I’m not really expecting much help necessarily by posting this. It’s more of a “state of the union” style post to document my own experiences with the AMD 9070 XT GPU as I haven’t seen a whole lot of NixOS 9070 users thus far.

The TLDR synopsis here at the top is “expect instability”. I’ll skip the really long rant here about being an Nvidia user since the TNT 2 days on Linux and the fact that the Nvidia driver stack has always been far more stable than the former fglrx or current amdgpu stack has ever been.

However, given the general insanity of the 12VHPWR connectors on the 50 series GeForce cards, I simply can’t justify putting something like that in my house. Having said that, part of the reason I’m posting this here now is because I ran into a somewhat similarly concerning issue with the 9070 also.

For the NixOS specific side of things, I’m not doing anything odd or crazy on this system:

boot = {
  initrd.kernelModules = [ "amdgpu" ];
  kernelPackages = pkgs.linuxPackages_6_14;
  kernelParams = [
    "amdgpu.ppfeaturemask=0xfffd3fff"
    "split_lock_detect=off"
  ];
};
hardware.graphics = {
  enable = true;
};

I think that’s about it other than using i3 via services.displayManager.defaultSession = "xsession" and having Steam installed via programs.steam.enable = true.

The good news is that things are great, at least for a day or so. Games in Steam run beautifully and all my various hardware accelerated video encoding/decoding needs work great. I initially was running this 9070 XT in a system built around an AMD 5800X, but in part due to the instability, I moved that system back to my older 3070 Ti and built a dedicated 9950X3D box. So even without the Jellyfin transcoding load from that older system (my NAS/Jellyfin box), the experience with the newer system is the same. Everything works for about a day, and then eventually, the system crashes.

The crash always seems to start with my mouse input starting to stagger for a few seconds like the system is under extreme load, followed eventually by a full graphical lockup or even the screen going blank and eventually the video card disabling the display port connection to the monitor entirely (as the monitor ends up powering off). This is partly why I’ve settled on the current kernel CLI parameters above as without those, the system seems to crash even faster. Additionally, I’ve had to disable the hardware video acceleration features in Steam, otherwise the UI likes to freeze and stop updating itself when I flip away to another i3 desktop and back to Steam. I can force the UI to start updating again if I open another window on the same desktop causing the Steam window itself to resize. But this is required every single time I flip away and back. Admittedly, this might be specific to i3 (I am using picom for whatever that’s worth).
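For anyone wondering what the amdgpu.ppfeaturemask value above actually changes, here is a minimal sketch that lists which named feature bits are cleared in a mask. The bit names are my reading of enum PP_FEATURE_MASK in the kernel's drivers/gpu/drm/amd/include/amd_shared.h, so treat the table as an assumption and check it against your own kernel source. The helper name cleared_features is made up for this example.

```python
# Bit positions and names as I understand them from enum PP_FEATURE_MASK in
# drivers/gpu/drm/amd/include/amd_shared.h. This table is an assumption:
# verify it against the kernel version you are actually running.
PP_FEATURE_BITS = {
    0: "PP_SCLK_DPM_MASK",
    1: "PP_MCLK_DPM_MASK",
    2: "PP_PCIE_DPM_MASK",
    3: "PP_SCLK_DEEP_SLEEP_MASK",
    4: "PP_POWER_CONTAINMENT_MASK",
    5: "PP_UVD_HANDSHAKE_MASK",
    6: "PP_SMC_VOLTAGE_CONTROL_MASK",
    7: "PP_VBI_TIME_SUPPORT_MASK",
    8: "PP_ULV_MASK",
    9: "PP_ENABLE_GFX_CG_THRU_SMU",
    10: "PP_CLOCK_STRETCH_MASK",
    11: "PP_OD_FUZZY_FAN_CONTROL_MASK",
    12: "PP_SOCCLK_DPM_MASK",
    13: "PP_DCEFCLK_DPM_MASK",
    14: "PP_OVERDRIVE_MASK",
    15: "PP_GFXOFF_MASK",
    16: "PP_ACG_MASK",
    17: "PP_STUTTER_MODE",
    18: "PP_AVFS_MASK",
    19: "PP_GFX_DCS_MASK",
}


def cleared_features(mask):
    """Return the names of known feature bits that are 0 in the given mask."""
    return [
        name
        for bit, name in sorted(PP_FEATURE_BITS.items())
        if not mask & (1 << bit)
    ]


if __name__ == "__main__":
    # The value from the kernelParams list above.
    for name in cleared_features(0xFFFD3FFF):
        print("disabled:", name)
```

If the table is right, the mask above clears the overdrive, GFXOFF and stutter mode bits and leaves everything else enabled.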

I should also mention I’m using ZFS everywhere for everything. This has caused me zero problems with my Nvidia cards in the past. But I obviously can’t assume that necessarily for the AMD driver stack.

Now, to be fair, I saw very similar types of problems with a 6800 XT previously which ended up going to my partner’s Windows machine as a result. And even my Lenovo laptop with a 6900 HS seems to like to lock up within roughly the same time frame usually. So it seems like the amdgpu driver stack is questionable all around, regardless of hardware generation to some degree. About the only amdgpu based system I don’t have this particular experience with is my Steam Deck, although it’s usually not running for more than a few hours any given day before I shut it down again.

The concerning part in all of this was that yesterday this happened on my 9950X3D system and after the display blanked and powered down, some of my fans spun up to ludicrous levels in my system like something was trying to massively overheat. I ended up holding down the power button to kill power entirely as there was so much noise and vibration, it sounded like something was about to damage itself. Since I didn’t feel like I had time to pull the case apart, I’m not positive it was the GPU fans trying to rapidly disassemble themselves. It could have just as easily been the CPU or case fans if the CPU itself was the one trying to eat itself alive.

Anyway, I know 6.15 has even more amdgpu fixes, which I will try just as soon as ZFS officially supports it. But as things stand with mesa-25.1.1 and kernel 6.14.9 (as of yesterday when this happened anyway; I’m on 6.14.10 now), this card isn’t what I’d call stable by any means. It will probably be a frustrating experience for folks expecting things to “just work”.

I’d love to hear from other NixOS users, especially if you’re having a nearly 100% stable experience with this GPU to see how things differ potentially. But I’m also assuming this is all just par for the course for the amdgpu driver stack to some extent since I’ve never seen it be a paragon of stability in any of its various incarnations. I am hoping it at least stabilizes to the point where I don’t have to assume my system will crash about once a day or so if I avoid shutting it down.

It might have been the GPU fan. I can get pretty close to what I think I was hearing if I run Quake II RTX on this system:

amdgpu_top v0.10.5
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│GPU Name                                | PCI Bus        |    VRAM Usage    |                                                                         │
│SCLK    MCLK    VDDGFX  Power           | GFX% UMC%Media%|     GTT Usage    |                                                                         │
│Temp    Fan     ECC_UnCorr. Throttle_Status                                 |                                                                         │
│------------------------------------------------------------------------------------------------------------------------------------------------------│
│#0  [AMD Radeon RX 9070 XT    ](gfx1201)| 0000:03:00.0   |  5012/ 16304 MiB |                                                                         │
│2954MHz 1258MHz 1006mV  329/330W        | 100%  33%   0% |   264/ 30943 MiB |                                                                         │
│  66C      0RPM [      N/A] [PPT0]                                          |                                                                         │
│------------------------------------------------------------------------------------------------------------------------------------------------------│
│#1  [AMD Radeon Graphics      ](gfx1036)| 0000:17:00.0   |    24/  2048 MiB |                                                                         │
│ 600MHz 3000MHz 1035mV   52/___W        |   0% ___%   0% |    15/ 30943 MiB |                                                                         │
│  49C   ____RPM [      N/A] []                                              |                                                                         │
│ Tctl: 53C CPU freq (100MHz): [30,30,24,30,28,30,30,45,44,44,44,44,22,30,30,44]                                                                       │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
┌┤ Processes ├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│┌┤ #0  AMD Radeon RX 9070 XT ├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐│
││ Name            |  PID  |KFD| VRAM | GTT  |CPU |GFX |COMP|DMA |VCNU|                                                                               ││
││ q2rtx           | 798844|   | 3627M|  121M|  4%|100%|  3%|  0%|  0%|                                                                               ││
││ .firefox-wrappe | 712750|   |  273M|   22M|  0%|  0%|  0%|  0%|  0%|                                                                               ││
││ .ghostty-wrappe | 727555|   |  128M|   25M|  3%|  0%|  0%|  0%|  0%|                                                                               ││
││ RDD Process     | 712866|   |  118M|    8M|  0%|  0%|  0%|  0%|  0%|                                                                               ││
││ .pwvucontrol-wr | 713473|   |   78M|   18M|  1%|  0%|  0%|  0%|  0%|                                                                               ││
││ steamwebhelper  | 713873|   |   65M|   58M|  2%|  0%|  0%|  0%|  0%|                                                                               ││
││ steam           | 713632|   |   20M|    6M|  0%|  0%|  0%|  0%|  0%|                                                                               ││
││ steamwebhelper  | 713941|   |   19M|    8M|  0%|  0%|  0%|  0%|  0%|                                                                               ││
││ electron        | 713343|   |   18M|    6M|  0%|  0%|  0%|  0%|  0%|                                                                               ││
││ amdgpu_top      | 798999|   |    0M|    2M|  0%|  0%|  0%|  0%|  0%|                                                                               ││
│└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘│
│┌┤ #1  AMD Radeon Graphics ├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐│
││ Name            |  PID  |KFD| VRAM | GTT  |CPU |GFX |COMP|DMA |DEC |ENC |                                                                          ││
││ steamwebhelper  | 713941|   |    0M|    2M|  0%|  0%|  0%|  0%|  0%|  0%|                                                                          ││
││ q2rtx           | 798844|   |    0M|    2M|  4%|  0%|  0%|  0%|  0%|  0%|                                                                          ││
││ amdgpu_top      | 798999|   |    0M|    2M|  0%|  0%|  0%|  0%|  0%|  0%|                                                                          ││
│└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘│
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Looking at the slightly prettier version of the same tool, I also see the junction and memory temperature getting up to 92°C with the edge temperature at the same 66°C shown in the more textual version above.
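If you want those numbers without a TUI, the amdgpu driver also exposes them through hwmon in sysfs. Below is a minimal sketch that reads every tempN_input file in one hwmon directory and pairs it with its tempN_label (edge, junction, mem on the cards I have seen). The card0 path is an assumption; on a system with an iGPU like mine the discrete card may be card1. The function name read_hwmon_temps is just something I made up here.

```python
from pathlib import Path


def read_hwmon_temps(hwmon_dir):
    """Return {label: degrees Celsius} for each tempN_input in hwmon_dir.

    hwmon reports temperatures in millidegrees Celsius, so values are
    divided by 1000. If a tempN_label file is missing, the input file
    name is used as the key instead.
    """
    temps = {}
    for input_file in sorted(Path(hwmon_dir).glob("temp*_input")):
        label_file = input_file.with_name(
            input_file.name.replace("_input", "_label")
        )
        if label_file.exists():
            label = label_file.read_text().strip()
        else:
            label = input_file.name
        temps[label] = int(input_file.read_text().strip()) / 1000.0
    return temps


if __name__ == "__main__":
    # Assumption: the GPU of interest is card0. Adjust for your system.
    base = Path("/sys/class/drm/card0/device/hwmon")
    if base.is_dir():
        for hwmon in sorted(base.glob("hwmon*")):
            print(hwmon, read_hwmon_temps(hwmon))
```

Running this in a loop while a game is going would make it easier to see whether junction or memory temperature is what the fans are reacting to.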

This version of Quake seems to either run very inefficiently or run flat out with all the stops pulled, as other, much more modern titles don’t push the card to quite the same maximum utilization as this one does.

Anyway, thought I’d follow up on the fan noise from before. It might be the same thing I was hearing when it crashed and went to a blank display yesterday.

As someone who has used an AMD card that regularly hit temperature limits and triggered an emergency power-off under load: try SSH-ing into the machine from elsewhere and running dmesg -ew or something similar. At least in my case the emergency shutdown triggered in a way that didn’t let journald write to disk, so the logs just stopped, but dmesg and sshd were running 'til the end, so I could see the message there.
For me too the junction temperature was very high, around 100°C in my case, with shutdown probably triggering at 110 to 120°C, which was already worrying.
This won’t solve your issue of course, but it will help you pin it down to “driver issue” or “temperature issue” at least, because from my experience the drivers were actually pretty stable.
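A rough sketch of that idea, in case it helps: run this on a second machine so the log survives even if the crashing box never gets to flush its journal. It follows dmesg -ew over SSH and keeps only lines that match a keyword list. The host name, the log file name and the keyword list are all placeholders I made up, not messages the driver is guaranteed to print, and dmesg may need root on the remote side depending on kernel.dmesg_restrict.

```python
import subprocess
import sys

# Hypothetical keyword list; adjust it to whatever your kernel actually logs.
KEYWORDS = ("amdgpu", "thermal", "temperature")


def is_interesting(line, keywords=KEYWORDS):
    """Return True if the line contains any keyword, ignoring case."""
    lowered = line.lower()
    return any(keyword in lowered for keyword in keywords)


def follow_dmesg(host, logfile_path):
    """Stream `dmesg -ew` from host over SSH and log the matching lines locally."""
    command = ["ssh", host, "dmesg", "-ew"]
    with subprocess.Popen(
        command, stdout=subprocess.PIPE, text=True, bufsize=1
    ) as proc, open(logfile_path, "a") as log:
        for line in proc.stdout:
            if is_interesting(line):
                sys.stdout.write(line)
                sys.stdout.flush()
                log.write(line)
                # Flush per line so the last message before a hang is kept.
                log.flush()


# Example call (placeholder host and file name):
# follow_dmesg("user@gaming-box", "gpu-dmesg.log")
```

The last few lines in the local log file should then show whether the kernel said anything about temperature before the machine went down.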

I’m now running my former NVidia card again, but I’ve settled on eventually moving to Intel with my next card because apparently NVidia and AMD are equally bad at doing what they do, or at least fail to offer reasonable pricing. YMMV of course.

I’m definitely not hitting any thermal limits under normal crash conditions. I’ve seen crashes both while simply using the media codec engines (browsing video in Firefox) and while doing fairly run of the mill gaming (e.g. Factorio) where the GPU fans aren’t even audible at the time the crash starts at least.

But I also have been able to SSH in at various points during some of these. Sometimes a reboot/poweroff will finish successfully and sometimes the system ends up hanging at that point. It’s not bad to the point of the fglrx days where the monitor going to sleep would crash the kernel. But it’s nowhere near as stable as the Nvidia driver stack (and never has been).

C’est la vie! Hopefully with Valve so heavily involved in the amdgpu driver stack also, things will eventually stabilize to a point where crashing isn’t simply a given. But it’s telling that you have also abandoned your AMD card for the time being. I worry every time I see a discussion where folks are pushing new users toward AMD GPUs because they are supposedly categorically better in every way. While the licensing (and therefore source and resulting open development) is hands down better on the AMD side, the usability and stability have always been lacking in my experience.

there’s a recently fixed bug upstream that may be causing this. see this previous discussion for a workaround that resolved the issue for me. the fix has been merged into AMD’s drm repo and the patch has been submitted to the mainline but I don’t think there’s been a release yet with the fix.

so check dmesg and see if this is what you’re experiencing, because if so, it’s a relatively simple workaround for the time being. bit annoying that this made it into a release but at least the fix isn’t involved.
