My gnome experience under nixos

I apologize in advance, because this post will not be very productive. It’s basically a list of issues I encountered, all of which are pretty hard to debug, and whenever I try to debug one I usually stumble into the next. Maybe the right thing would be to open one GitHub issue per bug, but I couldn’t convince myself yet, because these bugs sound really tough to pinpoint. I guess most of them are upstream problems, but I can’t really tell.
I realize writing and maintaining gnome is a lot of work and I appreciate everything that is being done for it. I really do. Still, my gnome drives me crazy quite often. The bugs listed here were all encountered within a 3h session, but most of them are not new to me. This is on nixos-unstable but I had most of these issues on 20.09, too.

  • gnome often hangs, that’s okayish, performance is probably hard. (I have a Thinkpad T480s which is actually not that old, but I can stand a few lags.)
  • when those hangs happen, the key I am currently pressing gets repeated a lot. When I try to fix the resulting typo by pressing backspace, the next hang often repeats the backspace, deleting everything I wrote. (That can be infuriating. I think it got a bit rarer when I switched to unstable. Improvement, yay!)
  • pam-auto-keyring-unlock fails roughly 3 out of 4 times when I (re-)login: it just prompts for the password of my default keyring although it shouldn’t. The odds get better when I wait 15s before logging in again.
  • When I unlock the keyring manually after the failure, the keyring does not work (i.e. respond) as an ssh-agent as it usually does.
  • When I disable auto-keyring-unlock (via forcing the nixos option), my gnome-keyring hangs with 100% cpu consumption on start.
  • I have done something with my home-manager (probably hard to reproduce, but I really don’t think I have done anything far out of the ordinary) so that closing the shell in which I ran home-manager switch causes gnome-shell to crash. (This one is actually a bit funny.)
  • After logging in again, some of the session variables that home-manager normally sets (and which are successfully set on first login) usually either don’t get set or get overridden by systemd defaults.
  • gnome-calendar crashes when I scroll to next January.
  • Sometimes gdm hangs and freezes.
  • Sometimes gdm login hangs after entering and confirming my password.
  • Sometimes after logging in again I don’t have a running user systemd session.
  • Sometimes the logout button does not have any effect.
  • Sometimes the gnome-shell just crashes right after login and I get logged out again.

Maybe other people can share if they have encountered at least some of those issues and maybe have some tips for me? Or just mention that their gnome (and especially gnome-keyring-unlock) runs just great?

I don’t want to give up on gnome, because I think the UI decisions are best in class. And I am always really happy when something works out of the box. (For example, just clicking in the options to map your wacom tablet input to a monitor of your choice. Imagine that working out of the box on a linux system 10 years ago … Or entering my nextcloud credentials and then having my calendars synchronized. Most of my i3/sway friends just don’t have working calendars on their system … [I am sure there are solutions for that, but in my sample most of them don’t have them set up.])

GNOME runs great for me, I don’t have any issues (neither with proprietary NVIDIA drivers on X11, nor amdgpu on Wayland).

When there are hangs, are there any useful messages in the logs (per journalctl)?

Gnome runs fine for me on an older ThinkPad (T460) with Intel graphics.

What does your resource usage look like when it hangs, or even when it’s not hanging?

My laptop was manufactured in 2019 I think. Have never really had lag on recent GNOME. I think an ssd has a huge thing to do with this.

hmm, sounds like keyboard latency? I’ve seen some data showing that from device to device it can be MUCH worse than others. Though considering your lags it’s probably not related to keyboard latency.

Never had those, actually found keyring to work very reliably.

:+1: I get crashes too with GNOME calendar in 3.38. Those need to be tracked upstream and fixed because it’s not very usable on unstable currently. Perhaps that’s Crash in gtk_image_reset() (#299) · Issues · GNOME / gnome-calendar · GitLab. Anyways, this one could very easily be tracked on github as GNOME Core Apps Usability 3.38.

Interesting. GDM works perfectly for me. Looking at the change log, there were some notable fixes: NEWS · gnome-3-38 · GNOME / gdm · GitLab.

How scary :frowning:

I wonder if this is related to gnome3 logout button doesn't show · Issue #100108 · NixOS/nixpkgs · GitHub. I think when I had that bug something like this would also happen. But it could also be some accountsservice thing or it’s inhibited somehow on your system from time to time.

Thank you for sharing these. I also think the output of nixos-version is pertinent for this nixos-unstable run, since a lot can change there. Overall, seeing all of this in a 3h session is… interesting to say the least, because I haven’t seen many of these first hand on any of several machines.

Thank you everyone for your replies!

When there are hangs, are there any useful messages in the logs (per journalctl)?

A while ago I saw warnings from libinput, that it had detected too much time had passed or something similar. I will look out for them the next time I have a hang.

What does your resource usage look like when it hangs, or even when it’s not hanging?

I normally get the hangs when the system is under a bit of load but CPU and memory are not nearly maxed out. Although things get rapidly worse when memory is maxed out.

I think an ssd has a huge thing to do with this.

I have a fast ssd.

hmm, sounds like keyboard latency? I’ve seen some data showing that from device to device it can be MUCH worse than others. Though considering your lags it’s probably not related to keyboard latency.

I am not sure. But I have never witnessed anything like it under sway. So I don’t think it’s just the keyboard.

Never had those, actually found keyring to work very reliably.

I envy you. But at least you get gnome-keyring-daemon[28769]: Unsupported or unknown SSH key algorithm: ssh-ed25519, too, right?

Interesting. GDM works perfectly for me. Looking at the change log, there were some notable fixes […]

The newest fix looks promising. Let’s hope things get better.

I wonder if this is related to […]. I think when I had that bug something like this would also happen.

Well I had that issue, too. But it’s gone by now.

tbh, I only threw in the last few to underline my pain. I guess most of them are a result of me quickly logging in and out for debugging purposes. I think the non-working logout is just caused by not having a systemd session.

So, debugging all of these issues will be a lot of work. There is an astonishingly large number of problems in my logs. (Most of them seem quite independent.) Some are obviously related to my issues:

e.g.

systemd-coredump[13261]: Process 1646 (.gnome-keyring-) of user 1000 dumped core.

others are just funny

systemd-xdg-autostart-generator[12954]: Exec binary ‘/usr/bin/skypeforlinux’ does not exist: No such file or directory

Haha, skype, nice try sneaking yourself into my autostart.

But I just wanted to share this one story: I decided to activate autoLogin today. After a reboot gdm logged my user in, pam-unlock failed (obviously), and then gnome-shell crashed and sent me to a black screen with a blinking cursor, whose rhythm makes me think it tried to start gdm but failed in an infinite loop. The system didn’t react to anything, so I had to reboot and flee my fate by switching to tty1 quickly enough before this happened again.
Another session today finished in a tty telling me my systemd (pid1) segfaulted. At this point I am considering bad karma as a diagnosis for my system …

Well, I just caught a hang in flagrante:

Dez 20 15:51:17 apollo gnome-shell[13091]: libinput error: client bug: timer event14 keyboard: scheduled expiry is in the past (-1790ms), your system is too slow
Dez 20 15:51:17 apollo gnome-shell[13091]: libinput error: client bug: timer event14 keyboard: scheduled expiry is in the past (-1682ms), your system is too slow
Dez 20 15:51:17 apollo gnome-shell[13091]: libinput error: client bug: timer event14 keyboard: scheduled expiry is in the past (-1282ms), your system is too slow
Dez 20 15:51:17 apollo gnome-shell[13091]: libinput error: client bug: timer event14 keyboard: scheduled expiry is in the past (-442ms), your system is too slow
Dez 20 15:51:18 apollo gnome-shell[13091]: libinput error: event1  - AT Translated Set 2 keyboard: client bug: event processing lagging behind by 169ms, your system is too slow
Dez 20 15:51:18 apollo kernel: mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 341)
Dez 20 15:51:18 apollo kernel: mce: CPU6: Core temperature above threshold, cpu clock throttled (total events = 341)
Dez 20 15:51:18 apollo kernel: mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 864)
Dez 20 15:51:18 apollo kernel: mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 864)
Dez 20 15:51:18 apollo kernel: mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 864)
Dez 20 15:51:18 apollo kernel: mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 864)
Dez 20 15:51:18 apollo kernel: mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 864)
Dez 20 15:51:18 apollo kernel: mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 864)
Dez 20 15:51:18 apollo kernel: mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 864)
Dez 20 15:51:18 apollo kernel: mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 864)
Dez 20 15:51:18 apollo kernel: mce: CPU6: Core temperature/speed normal
Dez 20 15:51:18 apollo kernel: mce: CPU7: Package temperature/speed normal
Dez 20 15:51:18 apollo kernel: mce: CPU0: Package temperature/speed normal
Dez 20 15:51:18 apollo kernel: mce: CPU4: Package temperature/speed normal
Dez 20 15:51:18 apollo kernel: mce: CPU5: Package temperature/speed normal
Dez 20 15:51:18 apollo kernel: mce: CPU1: Package temperature/speed normal
Dez 20 15:51:18 apollo kernel: mce: CPU2: Core temperature/speed normal
Dez 20 15:51:18 apollo kernel: mce: CPU3: Package temperature/speed normal
Dez 20 15:51:18 apollo kernel: mce: CPU2: Package temperature/speed normal
Dez 20 15:51:18 apollo kernel: mce: CPU6: Package temperature/speed normal
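To check whether the input lag and the thermal throttling keep showing up together, an excerpt like the one above could be tallied with a small script. This is only a rough sketch: the regular expressions are derived from the handful of lines quoted here, the function name `summarize` is made up for illustration, and other libinput or kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns are based only on the journal lines quoted above (an assumption);
# they are not taken from libinput or kernel documentation.
LAG_RE = re.compile(
    r"libinput error: .*?(?:in the past \(-(\d+)ms\)|lagging behind by (\d+)ms)"
)
THROTTLE_RE = re.compile(
    r"kernel: mce: CPU(\d+): (Core|Package) temperature above threshold"
)


def summarize(lines):
    """Count libinput lag warnings and thermal throttle messages in journal lines."""
    lags_ms = []
    throttles = Counter()
    for line in lines:
        lag = LAG_RE.search(line)
        if lag:
            # Exactly one of the two groups is set, depending on the wording.
            lags_ms.append(int(lag.group(1) or lag.group(2)))
            continue
        throttle = THROTTLE_RE.search(line)
        if throttle:
            throttles[throttle.group(2)] += 1
    return {
        "lag_warnings": len(lags_ms),
        "worst_lag_ms": max(lags_ms, default=0),
        "throttle_messages": dict(throttles),
    }


# A few lines copied from the excerpt above.
sample = [
    "Dez 20 15:51:17 apollo gnome-shell[13091]: libinput error: client bug: timer event14 keyboard: scheduled expiry is in the past (-1790ms), your system is too slow",
    "Dez 20 15:51:18 apollo gnome-shell[13091]: libinput error: event1  - AT Translated Set 2 keyboard: client bug: event processing lagging behind by 169ms, your system is too slow",
    "Dez 20 15:51:18 apollo kernel: mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 341)",
    "Dez 20 15:51:18 apollo kernel: mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 864)",
    "Dez 20 15:51:18 apollo kernel: mce: CPU6: Core temperature/speed normal",
]

print(summarize(sample))
```

Passing the lines of a longer journal dump to `summarize` instead of `sample` would show whether lag warnings only appear in boots that also log throttling.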

Sounds like you may be hitting a bug: GitHub - erpalma/throttled: Workaround for Intel throttling issues in Linux.

Seems it can even be enabled in NixOS with a simple option: NixOS Search

Thank you @austin, first I was really happy about your find!
Then I realized that I have this automatically enabled since I imported the right profile from nixos-hardware.

Darn! Have you used the same system with another Linux distro?

I have a similar experience on NixOS with GNOME 45 in a VMware VM, but only on Xorg.

When I log in to GNOME on Xorg, the system does not let me click on menus (it appears to freeze visually). If I press the Win key on my keyboard, the screen updates, then it freezes again until I press the Win key again to invoke the App Menu.

This occurs in my VM only under GNOME on Xorg; GNOME on Wayland works well. It also only occurs when you install NixOS with a non-GNOME environment and then use your configuration.nix or flake to switch to GNOME.