Dual GPU setup: nvidia-smi shows gnome-shell and the Nvidia GPU doesn't sleep

Hi all,

(also posted on Reddit)

I’m setting up my Lenovo ThinkPad P1 Gen 2 with NixOS, and right now I’m struggling to improve battery life by using the Nvidia GPU only when called for (offload mode). When I run nvidia-smi, it shows gnome-shell running on it:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.02              Driver Version: 545.29.02    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Quadro T1000                   On  | 00000000:01:00.0  On |                  N/A |
| N/A   42C    P8               3W /  35W |     19MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     11966      G   ...6m-gnome-shell-45.2/bin/gnome-shell        1MiB |
+---------------------------------------------------------------------------------------+

Also, running cat /sys/bus/pci/devices/0000\:01\:00.0/power_state returns D0, while I believe it should return D3cold when the card is sleeping. The Nvidia Quadro T1000 is a Turing generation card according to Wikipedia.

I want everything (including gnome-shell) to run on the Intel iGPU and let the Nvidia card go to sleep.

Does anybody know how I can fix this?

/etc/nixos/configuration.nix
# Edit this configuration file to define what should be installed on
# your system.  Help is available in the configuration.nix(5) man page
# and in the NixOS manual (accessible by running ‘nixos-help’).

{ config, lib, pkgs, ... }:

{
  imports =
    [ # Include the results of the hardware scan.
      ./hardware-configuration.nix
    ];

  # Bootloader.
  boot.loader.systemd-boot.enable = true;
  boot.loader.efi.canTouchEfiVariables = true;

  networking.hostName = "nixos"; # Define your hostname.
  # networking.wireless.enable = true;  # Enables wireless support via wpa_supplicant.

  # Configure network proxy if necessary
  # networking.proxy.default = "http://user:password@proxy:port/";
  # networking.proxy.noProxy = "127.0.0.1,localhost,internal.domain";

  # Enable networking
  networking.networkmanager.enable = true;

  ### Copied from https://nixos.wiki/wiki/Nvidia

  # Enable OpenGL
  hardware.opengl = {
    enable = true;
    driSupport = true;
    driSupport32Bit = true;
  };

  # Load nvidia driver for Xorg and Wayland
  services.xserver.videoDrivers = ["nvidia"];

  hardware.nvidia = {

    # Modesetting is required.
    modesetting.enable = true;
    
    ### See https://github.com/NixOS/nixos-hardware/issues/348#issuecomment-997123102
    # Nvidia power management. Experimental, and can cause sleep/suspend to fail.
    powerManagement.enable = true;
    # Fine-grained power management. Turns off GPU when not in use.
    # Experimental and only works on modern Nvidia GPUs (Turing or newer).
    powerManagement.finegrained = true;
    nvidiaPersistenced = true;

    # Use the NVidia open source kernel module (not to be confused with the
    # independent third-party "nouveau" open source driver).
    # Support is limited to the Turing and later architectures. Full list of 
    # supported GPUs is at: 
    # https://github.com/NVIDIA/open-gpu-kernel-modules#compatible-gpus 
    # Only available from driver 515.43.04+
    # Currently alpha-quality/buggy, so false is currently the recommended setting.
    open = false;

    # Enable the Nvidia settings menu,
    # accessible via `nvidia-settings`.
    nvidiaSettings = false;

    # Optionally, you may need to select the appropriate driver version for your specific GPU.
    package = config.boot.kernelPackages.nvidiaPackages.stable;

    prime = {
      offload = {
        enable = true;
        enableOffloadCmd = true;
      };
      # Make sure to use the correct Bus ID values for your system!
      nvidiaBusId = "PCI:1:0:0";
      intelBusId = "PCI:0:2:0";
    };
  };

  # Set your time zone.
  time.timeZone = "Europe/Amsterdam";

  # Select internationalisation properties.
  i18n.defaultLocale = "en_US.UTF-8";

  i18n.extraLocaleSettings = {
    LC_ADDRESS = "nl_NL.UTF-8";
    LC_IDENTIFICATION = "nl_NL.UTF-8";
    LC_MEASUREMENT = "nl_NL.UTF-8";
    LC_MONETARY = "nl_NL.UTF-8";
    LC_NAME = "nl_NL.UTF-8";
    LC_NUMERIC = "nl_NL.UTF-8";
    LC_PAPER = "nl_NL.UTF-8";
    LC_TELEPHONE = "nl_NL.UTF-8";
    LC_TIME = "nl_NL.UTF-8";
  };

  # Enable the X11 windowing system.
  services.xserver.enable = true;

  # Enable the GNOME Desktop Environment.
  services.xserver.displayManager.gdm.enable = true;
  services.xserver.desktopManager.gnome.enable = true;

  # Configure keymap in X11
  services.xserver = {
    layout = "us";
    xkbVariant = "";
  };

  # Enable CUPS to print documents.
  services.printing.enable = true;

  # Enable sound with pipewire.
  sound.enable = true;
  hardware.pulseaudio.enable = false;
  security.rtkit.enable = true;
  services.pipewire = {
    enable = true;
    alsa.enable = true;
    alsa.support32Bit = true;
    pulse.enable = true;
    # If you want to use JACK applications, uncomment this
    #jack.enable = true;

    # use the example session manager (no others are packaged yet so this is enabled by default,
    # no need to redefine it in your config for now)
    #media-session.enable = true;
  };

  # Enable touchpad support (enabled default in most desktopManager).
  # services.xserver.libinput.enable = true;

  # Define a user account. Don't forget to set a password with ‘passwd’.
  users.users.REDACTED = {
    isNormalUser = true;
    description = "REDACTED";
    extraGroups = [ "networkmanager" "wheel" ];
    packages = with pkgs; [
      firefox
    #  thunderbird
    ];
  };

  # Allow unfree packages
  nixpkgs.config.allowUnfree = true;

  # List packages installed in system profile. To search, run:
  # $ nix search wget
  environment.systemPackages = with pkgs; [
  #  vim # Do not forget to add an editor to edit configuration.nix! The Nano editor is also installed by default.
  #  wget
    git
    killall
  ];

  # Some programs need SUID wrappers, can be configured further or are
  # started in user sessions.
  # programs.mtr.enable = true;
  # programs.gnupg.agent = {
  #   enable = true;
  #   enableSSHSupport = true;
  # };

  # List services that you want to enable:
  ### As found here: https://github.com/NixOS/nixos-hardware/blob/8e34f33464d77bea2d5cf7dc1066647b1ad2b324/lenovo/thinkpad/p1/default.nix
  # services.fprintd.enable = true;

  services.power-profiles-daemon.enable = false;
  services.tlp = {
    enable = true;
    settings = {
        CPU_SCALING_GOVERNOR_ON_AC = "performance";
        CPU_SCALING_GOVERNOR_ON_BAT = "powersave";

        CPU_ENERGY_PERF_POLICY_ON_BAT = "power";
        CPU_ENERGY_PERF_POLICY_ON_AC = "performance";

        CPU_MIN_PERF_ON_AC = 0;
        CPU_MAX_PERF_ON_AC = 100;
        CPU_MIN_PERF_ON_BAT = 0;
        CPU_MAX_PERF_ON_BAT = 40;

        # Optional: helps preserve long-term battery health
        START_CHARGE_THRESH_BAT0 = 80; # start charging at 80% and below
        STOP_CHARGE_THRESH_BAT0 = 90; # stop charging at 90% and above

      };
  };

  # Enable the OpenSSH daemon.
  # services.openssh.enable = true;

  # Open ports in the firewall.
  # networking.firewall.allowedTCPPorts = [ ... ];
  # networking.firewall.allowedUDPPorts = [ ... ];
  # Or disable the firewall altogether.
  # networking.firewall.enable = false;

  # This value determines the NixOS release from which the default
  # settings for stateful data, like file locations and database versions
  # on your system were taken. It‘s perfectly fine and recommended to leave
  # this value at the release version of the first install of this system.
  # Before changing this value read the documentation for this option
  # (e.g. man configuration.nix or on https://nixos.org/nixos/options.html).
  system.stateVersion = "23.11"; # Did you read the comment?

}
Neofetch
          ▗▄▄▄       ▗▄▄▄▄    ▄▄▄▖            REDACTED@nixos 
          ▜███▙       ▜███▙  ▟███▛            ---------------- 
           ▜███▙       ▜███▙▟███▛             OS: NixOS 23.11.2413.32f63574c85f (Tapir) x86_64 
            ▜███▙       ▜██████▛              Host: LENOVO 20QUS00000 
     ▟█████████████████▙ ▜████▛     ▟▙        Kernel: 6.1.69 
    ▟███████████████████▙ ▜███▙    ▟██▙       Uptime: 30 mins 
           ▄▄▄▄▖           ▜███▙  ▟███▛       Packages: 954 (nix-system), 325 (nix-user) 
          ▟███▛             ▜██▛ ▟███▛        Shell: bash 5.2.15 
         ▟███▛               ▜▛ ▟███▛         Resolution: 1920x1080, 1920x1080 
▟███████████▛                  ▟██████████▙   DE: GNOME 45.2 (Wayland) 
▜██████████▛                  ▟███████████▛   WM: Mutter 
      ▟███▛ ▟▙               ▟███▛            WM Theme: Adwaita 
     ▟███▛ ▟██▙             ▟███▛             Theme: Adwaita [GTK2/3] 
    ▟███▛  ▜███▙           ▝▀▀▀▀              Icons: Adwaita [GTK2/3] 
    ▜██▛    ▜███▙ ▜██████████████████▛        Terminal: kgx 
     ▜▛     ▟████▙ ▜████████████████▛         CPU: Intel i7-9750H (12) @ 4.500GHz 
           ▟██████▙       ▜███▙               GPU: NVIDIA Quadro T1000 Mobile 
          ▟███▛▜███▙       ▜███▙              GPU: Intel CoffeeLake-H GT2 [UHD Graphics 630] 
         ▟███▛  ▜███▙       ▜███▙             Memory: 4396MiB / 31879MiB 
         ▝▀▀▀    ▀▀▀▀▘       ▀▀▀▘
Related sources I found
  1. Power Managment with nvidia GPU - #14 by lovirent
  2. Laptop - NixOS Wiki
  3. Nvidia - NixOS Wiki
  4. https://github.com/NixOS/nixos-hardware/blob/8e34f33464d77bea2d5cf7dc1066647b1ad2b324/common/gpu/nvidia/prime.nix
  5. https://www.reddit.com/r/Fedora/comments/x487g1/how_to_force_waylandgnomeshell_to_use_intel_igpu/
  6. Gnome 43: gnome-shell forces itself onto dGPU instead of using iGPU in PRIME Render Offloading setups using Wayland (#2969) · Issues · GNOME / mutter · GitLab
  7. https://www.reddit.com/r/Fedora/comments/z29edh/deleted_by_user/
  8. Gnome 43: gnome-shell forces itself onto dGPU instead of using iGPU in PRIME Render Offloading setups using Wayland (#6146) · Issues · GNOME / gnome-shell · GitLab

Not sure if this is the best solution, but apparently some people disabled some modules to do that, see: Fully disabling the Nvidia dGPU on an Optimus Laptop - #8 by matklad

Doesn’t this make the Nvidia GPU completely unavailable and thus not suitable for offloading?

Oh right. If you want to use offloading, there is the option:

# Enable offload
hardware.nvidia.prime.offload.enable = true;
# Puts the card to sleep when no application is using it;
# notably, this enables NVreg_DynamicPowerManagement=0x02,
# described in https://download.nvidia.com/XFree86/Linux-x86_64/435.17/README/primerenderoffload.html
hardware.nvidia.powerManagement.finegrained = true;

that you can set. There are other options as well, including finegrained (see the list here, for instance), but as far as I understand you first need to enable offload. When in doubt, the source here shows what an option does, and it refers to this documentation for further details.

I’m not sure how gnome implements the wayland protocol, but I know I had the same issue under Hyprland, even with hardware.nvidia.powerManagement.finegrained = true. That is, nvidia-smi was still showing a running process (Hyprland itself in my case), which prevented the device from fully powering off.

The solution in my case (which will not work for gnome) was to explicitly set WLR_DRM_DEVICES, a variable used by wlroots-based compositors that contains the list of DRM devices to consider on startup. If the nvidia card is in the list, even if it is not the first entry (and thus not the one initialized and used to present), the process still seems to get associated with the GPU (the compositor probably opens a display context in the kernel or something).

If you don’t set the variable at all, wlroots-based compositors consider all available cards, so the solution was to set it explicitly and specify only the Intel card. Nvidia offload still works and the card will still spin up when asked to, but otherwise there are no processes associated with it, and the finegrained power management will automatically put it to sleep when it can.
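
For reference, a minimal sketch of how that could look in configuration.nix. The card path is an assumption, so check first which /dev/dri/cardN is the Intel one on your machine.

```nix
{ ... }:

{
  # Assumption: /dev/dri/card1 is the Intel iGPU on this machine.
  # WLR_DRM_DEVICES is a colon-separated list, so paths that contain
  # colons themselves (such as the /dev/dri/by-path symlinks) cannot
  # be used here directly.
  environment.sessionVariables.WLR_DRM_DEVICES = "/dev/dri/card1";
}
```

The variable has to be in the compositor’s environment when it starts, so a re-login is needed after rebuilding.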

I’m sure your issue is that Gnome is doing something similar under the hood, but I’m not sure if they provide an interface to override it somehow, like wlroots does.

That seems like a good suggestion. I’ve found this MR on Gnome Gitlab which seems to do just what you said. I have added it to my udev rules with the following:

{ config, lib, pkgs, ... }:

let
  gnome-gpu-rule = pkgs.writeTextFile {
    name = "61-mutter-primary-gpu.rules";
    text = ''
      ENV{DEVNAME}=="/dev/dri/card1", TAG+="mutter-device-preferred-primary"
    '';
    destination = "/etc/udev/rules.d/61-mutter-primary-gpu.rules";
  };
in {
  imports =
    [ # Include the results of the hardware scan.
      ./hardware-configuration.nix
    ];
  ### rest of configuration.nix...
  services.udev.packages = [ gnome-gpu-rule ];
  ### rest of configuration.nix...
}

When I set it to card1 it runs fine, but on the Nvidia GPU. When I set it to card0, Gnome won’t launch; switching to tty2 with ctrl+alt+f2 and running gnome-shell there gives an error along the lines of “unsupported session type”.

I suppose this might be because of services.xserver.videoDrivers = ["nvidia"];? Should I set this to services.xserver.videoDrivers = ["modesetting"];? The issue with that is that nvidia-smi and nvidia-offload aren’t available in that case…

Just adding some information (for everybody’s reference):

Checking whether you need card0 or card1 can be done with the following command:

$ ls -l /sys/class/drm/card*/device/driver
lrwxrwxrwx 1 root root 0  4 jan 17:55 /sys/class/drm/card0/device/driver -> ../../../../bus/pci/drivers/nvidia
lrwxrwxrwx 1 root root 0  4 jan 17:55 /sys/class/drm/card1/device/driver -> ../../../bus/pci/drivers/i915
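
Since the card numbering isn’t guaranteed to stay the same, a variant of the udev rule that matches on the driver instead of a fixed device name might be more robust. This is an untested sketch: it assumes the iGPU is bound to the i915 driver as in the output above, and it uses services.udev.extraRules rather than a separate rules file.

```nix
{ ... }:

{
  # Untested sketch: tag whichever /dev/dri/cardN node belongs to a
  # device driven by i915, instead of hard-coding the card number.
  services.udev.extraRules = ''
    ENV{DEVNAME}=="/dev/dri/card*", DRIVERS=="i915", TAG+="mutter-device-preferred-primary"
  '';
}
```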

Logs can be gathered without a desktop by running journalctl -b 0 (-b 0 for everything since last boot). I have found the following:

  1. With the udev rule set to the iGPU:
jan 04 17:52:09 nixos .gnome-shell-wr[1872]: Running GNOME Shell (using mutter 45.2) as a Wayland display server
jan 04 17:52:09 nixos .gnome-shell-wr[1872]: Failed to make thread 'KMS thread' realtime scheduled: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Name "org.freedesktop.RealtimeKit1" does not exist
jan 04 17:52:09 nixos .gnome-shell-wr[1872]: Device '/dev/dri/card0' prefers shadow buffer
jan 04 17:52:09 nixos .gnome-shell-wr[1872]: Added device '/dev/dri/card0' (nvidia-drm) using atomic mode setting.
jan 04 17:52:09 nixos .gnome-shell-wr[1872]: Device '/dev/dri/card1' prefers shadow buffer
jan 04 17:52:09 nixos .gnome-shell-wr[1872]: Added device '/dev/dri/card1' (i915) using atomic mode setting.
jan 04 17:52:09 nixos .gnome-shell-wr[1872]: Created gbm renderer for '/dev/dri/card0'
jan 04 17:52:09 nixos .gnome-shell-wr[1872]: Created gbm renderer for '/dev/dri/card1'
jan 04 17:52:09 nixos .gnome-shell-wr[1872]: GPU /dev/dri/card1 selected primary given udev rule
  2. Without the udev rule:
jan 04 17:26:27 nixos .gnome-shell-wr[1954]: Running GNOME Shell (using mutter 45.2) as a Wayland display server
jan 04 17:26:28 nixos .gnome-shell-wr[1954]: Failed to make thread 'KMS thread' realtime scheduled: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Name "org.freedesktop.RealtimeKit1" does not exist
jan 04 17:26:28 nixos .gnome-shell-wr[1954]: Device '/dev/dri/card0' prefers shadow buffer
jan 04 17:26:28 nixos .gnome-shell-wr[1954]: Added device '/dev/dri/card0' (nvidia-drm) using atomic mode setting.
jan 04 17:26:28 nixos .gnome-shell-wr[1954]: Device '/dev/dri/card1' prefers shadow buffer
jan 04 17:26:28 nixos .gnome-shell-wr[1954]: Added device '/dev/dri/card1' (i915) using atomic mode setting.
jan 04 17:26:28 nixos .gnome-shell-wr[1954]: Created gbm renderer for '/dev/dri/card0'
jan 04 17:26:28 nixos .gnome-shell-wr[1954]: Created gbm renderer for '/dev/dri/card1'
jan 04 17:26:28 nixos .gnome-shell-wr[1954]: Boot VGA GPU /dev/dri/card1 selected as primary

(man I love NixOS, I never would’ve tried any of these things without the declarative way of setting up my machine)

So it seems that Gnome selects the Intel iGPU as its primary GPU in both cases, but nvidia-smi still reports gnome-shell running on the Nvidia card, which prevents it from sleeping. Killing the process by the PID that nvidia-smi reports confirms that it is the main gnome process.

(logs too big to post here unfortunately)

Solved: I read through Gnome 43: gnome-shell forces itself onto dGPU instead of using iGPU in PRIME Render Offloading setups using Wayland (#6146) · Issues · GNOME / gnome-shell · GitLab again, and it turns out gnome-remote-desktop is the culprit somehow. Disabling and stopping it with systemctl --user disable --now gnome-remote-desktop.service immediately makes the GPU go to D3cold, and powertop shows the lower power draw.

Man this was a weird one. I’ll figure out how to make this permanent in NixOS later. Thanks for the replies and reading everybody!
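
For anyone landing here later, a sketch of what the permanent version might look like. NixOS has a services.gnome.gnome-remote-desktop.enable option, and as far as I can tell the GNOME desktop module turns it on by default. I haven’t verified this on the setup above, so treat it as an assumption.

```nix
{ lib, ... }:

{
  # Assumption: the GNOME desktop module enables gnome-remote-desktop
  # by default; lib.mkForce makes sure this setting wins over it.
  services.gnome.gnome-remote-desktop.enable = lib.mkForce false;
}
```

After a rebuild and re-login, nvidia-smi and the power_state file from the first post should show whether the card now goes to sleep.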
