AMD RX 7700 XT not being detected properly

Relatively new to NixOS, so sorry if I’m missing something simple here.

I’ve just finished building a new PC with the following hardware:

  • CPU: AMD Ryzen 5 7600X
  • Mobo: Gigabyte B650 Gaming X AX
  • GPU: AMD RX 7700 XT

NixOS installed smoothly and I was able to pull over a full configuration from another device without issue. However, I then noticed that while I was requesting a Wayland session of GNOME, I was actually getting an X11 session. As I dug into this, I realised that the GPU wasn’t being recognised properly at all, which can cause Wayland to not load correctly.

lspci returns the following:

00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14d8
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14da
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 14db
00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 14db
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14da
00:02.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 14db
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14da
00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14da
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14da
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 14dd
00:08.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 14dd
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 71)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 14e7
01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev 11)
02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch (rev 11)
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 747e (rev ff)
03:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device ab30
04:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
05:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43f4 (rev 01)
06:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43f5 (rev 01)
06:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43f5 (rev 01)
06:05.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43f5 (rev 01)
06:06.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43f5 (rev 01)
06:07.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43f5 (rev 01)
06:08.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43f5 (rev 01)
06:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43f5 (rev 01)
06:0a.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43f5 (rev 01)
06:0b.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43f5 (rev 01)
06:0c.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43f5 (rev 01)
06:0d.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43f5 (rev 01)
0e:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
0f:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter
10:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 43f7 (rev 01)
11:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] Device 43f6 (rev 01)
12:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Raphael (rev c7)
12:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller
12:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] VanGogh PSP/CCP
12:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15b6
12:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15b7
12:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h/19h HD Audio Controller
13:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15b8

Note the Device 747e (rev ff), which seems to correspond to the id of the chip used in the RX 7700 XT and RX 7800 XT.

Similarly, inxi -Gxx returns the following, again with no specifics for what the GPU actually is:

Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] vendor: XFX Limited driver: amdgpu v: kernel bus-ID: 03:00.0 
           chip-ID: 1002:747e 
           Device-2: Advanced Micro Devices [AMD/ATI] Raphael vendor: Gigabyte driver: amdgpu v: kernel bus-ID: 12:00.0 
           chip-ID: 1002:164e 
           Display: x11 server: X.org 1.21.1.8 compositor: gnome-shell driver: loaded: amdgpu note: n/a (using device driver) 
           resolution: <missing: xdpyinfo> 
           OpenGL: renderer: AMD Radeon Graphics (gfx1101 LLVM 15.0.7 DRM 3.54 6.5.3) v: 4.6 Mesa 23.0.3 direct render: Yes

Trying to run any GPU monitoring tools usually fails. radeontop returns "Unknown Radeon card. <= R500 won't work, new cards might." and fails to display much at all. radeon-profile can’t even find anything to inspect. amdgpu_top seems to be able to read everything about the card other than its name, and nvtop can’t get much either:

Device 1 [AMD Radeon Graphics] PCIe GEN N/A RX: N/A TX: N/A

This is quite problematic, since I can’t reliably monitor temperatures or really control anything about the GPU, on top of seemingly not being able to start a Wayland session. I’ve followed everything on AMD GPU - NixOS Wiki and the amdgpu driver seems to be starting correctly. I’ve also run everything without the amdgpu drivers and get exactly the same results. Updating to the latest kernel on 23.05 made no difference either.

Could this be mesa-related in some way? I’ve read that people were having issues back when the 6900 XT released and required a version of mesa from unstable, but I haven’t been able to find anything about that for these cards.

Full configs are at GitHub - joegilkes/nixos-config, with hardware specific configuration in https://github.com/joegilkes/nixos-config/tree/main/systems/x86_64-linux/timber-hearth.

The wiki and nixos-hardware aren’t being particularly helpful here… For example, the hardware.amdgpu.amdvlk option is a bad idea these days, as the Mesa RADV drivers are the good ones. And services.xserver.videoDrivers = [ "amdgpu" ]; should be modesetting, not amdgpu, though thankfully that seems to have been fixed in nixos-hardware (but not the wiki).
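
For reference, a minimal sketch of what that advice might look like as a NixOS module fragment. The option names are assumptions based on the 23.05 module set (later releases rename the hardware.opengl options), and the fragment is illustrative rather than taken from the linked config:

```nix
# Hypothetical module fragment; assumes NixOS 23.05 option names.
{ ... }:
{
  # Use the generic modesetting DDX for X11 sessions rather than the
  # xf86-video-amdgpu one. The amdgpu kernel module still drives the card.
  services.xserver.videoDrivers = [ "modesetting" ];

  # Enable Mesa's userspace drivers; RADV is Mesa's Vulkan driver for AMD.
  hardware.opengl.enable = true;
  hardware.opengl.driSupport = true;

  # Deliberately left unset: hardware.amdgpu.amdvlk (the option mentioned
  # above), so that Vulkan goes through RADV instead of AMDVLK.
}
```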

The OpenGL renderer line in your inxi output says gfx1101, which is the Navi 32 chip used in the 7700 XT.

Thanks for clearing that up, I had noticed that it seemed a bit inconsistent but wasn’t sure which source was correct. I have tried both with and without both settings though (with full reboots in between, in case that should matter) without any change in either the Wayland situation or the GPU identification.

Good catch, I hadn’t spotted that. However, the main problem beyond lack of proper naming is the lack of sensor data and fan control, since most programs can’t identify what the GPU is.

I’m currently trying a move to the version of mesa in unstable. It may take a while since it looks like everything with mesa as a dependency is getting recompiled from scratch though…

Yea, and unfortunately you can’t shortcut this by changing just the system drivers, because everything has to be linked directly to the libgbm of the current mesa. See Revert "nixos/opengl: add mesaPackage option" by K900 · Pull Request #225325 · NixOS/nixpkgs · GitHub

I’d read this, yeah. I was trying to go along the lines of what you described in "mesa: using a newer mesa version for drivers crashes DEs under wayland · Issue #223729 · NixOS/nixpkgs · GitHub", but I ran out of memory to the point that I was spending more CPU time moving data into and out of swap than actually compiling.

Moving everything in my config to unstable should work though, right? I was having a go last night, but after updating my flake.lock I unfortunately ran into this bug: "error: path is not valid using nix ==2.18.0 · Issue #9052 · NixOS/nix · GitHub", which slowed things down a lot.

Yea, using nixos-unstable for the whole system is probably the best way to get very recent versions of stuff like mesa. As for the Nix bug, just switch to Nix 2.17 until it’s fixed.
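
As a rough sketch of both suggestions in a flake-based setup: the host name is borrowed from the repository path linked earlier, ./configuration.nix is a hypothetical placeholder, and pkgs.nixVersions.nix_2_17 is an assumption that only holds while the pinned nixpkgs revision still ships that attribute:

```nix
# Hypothetical flake.nix; names and file layout are assumptions.
{
  # Track nixos-unstable for the whole system so that Mesa and everything
  # linked against it come from the same package set.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs, ... }: {
    nixosConfigurations.timber-hearth = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        ./configuration.nix
        ({ pkgs, ... }: {
          # Stay on Nix 2.17 until the 2.18.0 "path is not valid" bug is
          # fixed. Assumes this attribute exists in the pinned nixpkgs.
          nix.package = pkgs.nixVersions.nix_2_17;
        })
      ];
    };
  };
}
```

After changing the input, the lock file needs updating (for example with nix flake update) before rebuilding.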

Moving to unstable seems to have helped a lot, GNOME is now correctly starting on Wayland and I’m able to get significantly more information from the GPU.

radeon-profile and nvtop now display GPU information properly, with the exception of fan speed (fan control within radeon-profile also doesn’t work). That seems to be a driver-level problem with RDNA3 cards that is being worked on: 7900 XTX Unable to set fan speed (#2402) · Issues · drm / amd · GitLab. radeontop still can’t identify the card properly, but this appears to be a problem with the program rather than the underlying driver: UNKNOWN_CHIP with Navi31 · Issue #151 · clbr/radeontop · GitHub.

I think that’s as much of a fix as I’m going to get for now. I guess we can assume that the problem was mesa, but without being able to change its version in isolation it’s very hard to tell. Thanks for the help!