Hello!
Disclaimer: this is my first post on any sort of forum, so I’m open to feedback on how I post.
I’m running into some strange NVIDIA power management behavior on my Lenovo Legion laptop when on battery power. I’m running X on my integrated AMD GPU, and I’ve followed the NixOS NVIDIA page.
While connected to the charger, it works fine; the card enters its sleep state (confirmed through “/sys/bus/pci/devices/0000:01:00.0/power/runtime_status”, where the address corresponds to my card, as well as through other sources).
However, when I unplug the charger, it mysteriously goes into “active” mode (from D3cold to D0) and stays there. I’ve disabled TLP and anything else in my configuration that could conflict with it; it’s basically down to the bare bones.
There’s a second PCI device which corresponds to the nvidia audio subsystem, which also needs to have power management enabled for the nvidia card to go to sleep (I think this happens automatically with the nixos config, but I’ve also been able to force it with “echo auto | sudo tee /sys/bus/pci/devices/0000:01:00.1/power/control”).
When my laptop is unplugged (or when it boots unplugged), something removes the audio PCI device, and the graphics PCI device goes into the D0 active power state permanently, even though it should still have power management enabled (confirmed with “/sys/bus/pci/devices/0000:01:00.0/power/control” returning “auto”).
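In case it helps anyone reproduce this, here is a minimal sketch of the sysfs checks described above. The helper name pm_state and its optional second argument (an alternative sysfs root, only there so it can be pointed at a fake tree) are illustrative and not part of any tool; the two addresses are the ones from this post and will differ on other machines.

```shell
# pm_state DEVICE [SYSFS_ROOT]
# Prints the runtime PM settings of one PCI function, or "missing" if the
# function is not present (as happens to 0000:01:00.1 after unplugging).
# The body runs in a subshell, so its variables do not leak into the caller.
pm_state() (
  dev="$1"
  root="${2:-/sys}"   # hypothetical override, only useful for testing
  dir="$root/bus/pci/devices/$dev"
  if [ -d "$dir" ]; then
    control="$(cat "$dir/power/control" 2>/dev/null || echo '?')"
    status="$(cat "$dir/power/runtime_status" 2>/dev/null || echo '?')"
    echo "$dev: control=$control runtime_status=$status"
  else
    echo "$dev: missing"
  fi
)

# GPU function and its audio function, as addressed in this post.
for dev in 0000:01:00.0 0000:01:00.1; do
  pm_state "$dev"
done
```

Run once on the charger and once on battery, this should show whether the audio function has disappeared and whether the GPU is still reported as "active" while its control file says "auto".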
I’ve done some investigating, and if I have “udevadm monitor” running when I unplug it, it seems the kernel is removing the device:
# udevadm monitor:
KERNEL[960.782975] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/controlC0 (sound)
KERNEL[960.783066] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input17/event13 (input)
KERNEL[960.788150] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input17 (input)
KERNEL[960.788184] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input16/event12 (input)
UDEV [960.797849] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/controlC0 (sound)
UDEV [960.798570] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input17/event13 (input)
UDEV [960.799050] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input17 (input)
UDEV [960.799881] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input16/event12 (input)
KERNEL[960.801130] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input16 (input)
KERNEL[960.801182] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input15/event11 (input)
UDEV [960.801982] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input16 (input)
UDEV [960.802138] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input15/event11 (input)
KERNEL[960.809736] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input15 (input)
KERNEL[960.809837] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input14/event10 (input)
UDEV [960.810719] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input15 (input)
UDEV [960.811042] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input14/event10 (input)
KERNEL[960.821528] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input14 (input)
KERNEL[960.821646] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/hwC0D0 (sound)
KERNEL[960.821742] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/pcmC0D9p (sound)
KERNEL[960.821785] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/pcmC0D8p (sound)
KERNEL[960.821869] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/pcmC0D7p (sound)
KERNEL[960.821908] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/pcmC0D3p (sound)
KERNEL[960.822035] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0 (sound)
UDEV [960.822670] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/pcmC0D9p (sound)
UDEV [960.822695] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/input14 (input)
UDEV [960.822710] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/hwC0D0 (sound)
UDEV [960.824093] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/pcmC0D3p (sound)
UDEV [960.824113] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/pcmC0D7p (sound)
KERNEL[960.825217] unbind /devices/pci0000:00/0000:00:01.1/0000:01:00.1/hdaudioC0D0 (hdaudio)
KERNEL[960.825253] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/hdaudioC0D0 (hdaudio)
UDEV [960.825298] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/pcmC0D8p (sound)
KERNEL[960.825634] unbind /devices/pci0000:00/0000:00:01.1/0000:01:00.1 (pci)
UDEV [960.825674] unbind /devices/pci0000:00/0000:00:01.1/0000:01:00.1/hdaudioC0D0 (hdaudio)
UDEV [960.825723] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0 (sound)
KERNEL[960.825771] remove /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:00/device:02/wakeup/wakeup14 (wakeup)
KERNEL[960.825794] remove /devices/virtual/devlink/pci:0000:01:00.0--pci:0000:01:00.1 (devlink)
KERNEL[960.825843] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1 (pci)
UDEV [960.826038] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1/hdaudioC0D0 (hdaudio)
UDEV [960.826167] remove /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:00/device:02/wakeup/wakeup14 (wakeup)
UDEV [960.826191] remove /devices/virtual/devlink/pci:0000:01:00.0--pci:0000:01:00.1 (devlink)
UDEV [960.826329] unbind /devices/pci0000:00/0000:00:01.1/0000:01:00.1 (pci)
UDEV [960.826501] remove /devices/pci0000:00/0000:00:01.1/0000:01:00.1 (pci)
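The output above is noisy, so here is a small sketch for cutting it down to the kernel-originated events on the pci subsystem only. The function name pci_kernel_events is made up for illustration, and the pattern assumes the line layout shown in the log above.

```shell
# pci_kernel_events: read saved `udevadm monitor` output on stdin and keep
# only lines that come from the kernel and belong to the pci subsystem.
pci_kernel_events() {
  grep -E '^KERNEL\[[0-9.]+\] +[a-z]+ +.* \(pci\)$'
}

# Usage on a saved log (monitor.log is a hypothetical file name):
#   pci_kernel_events < monitor.log
#
# For a live view, udevadm can do similar filtering itself:
#   udevadm monitor --kernel --subsystem-match=pci
```

Applied to the log above, this should leave just the unbind and remove events for 0000:01:00.1, which makes it easier to line them up with timestamps in dmesg.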
Also, dmesg at that same moment shows that something (possibly the same culprit?) tries to remove the graphics card itself:
NVRM: Attempting to remove device 0000:01:00.0 with non-zero usage count!
I’m not sure if this is the right place to post this, since it may well be a driver, laptop-model, or BIOS-specific problem, but hopefully someone can help.
Also, here’s the NVIDIA-related section of my configuration.nix:
# Enable openGL
hardware.opengl = {
  enable = true;
  driSupport = true;
  driSupport32Bit = true;
};
# Load drivers for X
services.xserver.videoDrivers = [ "amdgpu" ];
hardware.nvidia = {
  modesetting.enable = true;
  powerManagement.enable = true;
  powerManagement.finegrained = true;
  # Use proprietary driver
  open = false;
  nvidiaSettings = true;
  # Select driver
  package = config.boot.kernelPackages.nvidiaPackages.production;
  # enable and configure PRIME
  prime = {
    offload.enable = true;
    offload.enableOffloadCmd = true;
    amdgpuBusId = "PCI:6:0:0";
    nvidiaBusId = "PCI:1:0:0";
  };
};
boot.kernelModules = [ "amdgpu" "nvidia" ];
Something to note is that if I add
options nvidia "NVreg_EnableGpuFirmware=1"
to my modprobe configuration in configuration.nix, then both the graphics and the audio PCI devices are removed (successfully this time).
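To double-check that a module option like this actually reached the driver, a sketch along these lines may help. It assumes the proprietary driver exposes its effective parameters as "Name: value" lines in /proc/driver/nvidia/params with the NVreg_ prefix dropped; the helper name nv_param and its optional file argument are hypothetical.

```shell
# nv_param NAME [FILE]
# Prints the value of one driver parameter from a "Name: value" listing.
# FILE defaults to the driver's params file; the override exists only so
# the function can be tried against a sample file.
nv_param() (
  name="$1"
  file="${2:-/proc/driver/nvidia/params}"
  awk -F': *' -v key="$name" '$1 == key { print $2 }' "$file"
)

# Only query the real file if the driver is loaded and the file is readable.
if [ -r /proc/driver/nvidia/params ]; then
  nv_param EnableGpuFirmware
  nv_param DynamicPowerManagement
fi
```

Comparing the two values with and without the extra option should at least confirm which settings the driver is running with when the devices get removed.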
Also, I examined all existing udev rules on my system and ruled them out as the source of this behaviour (and udevadm shows that udev reacts to what the kernel does, not the other way around).
My hardware is a Lenovo Legion Slim 5 14APH8 with AMD Ryzen 7 7840HS (+ Radeon 780M) as well as an NVIDIA 4060.
I should also add that I can successfully use nvidia offloading, so at least no problems there.
Sorry for the long post, and thanks for the help.