I’m hoping someone can help me. I’ve exhausted my problem-solving capability and need fresh ideas to diagnose and ideally solve these issues. I’ve listed a variety of issues below that all started after the upgrade to 24.05. Before you assume it’s NVIDIA drivers, please read all the issues and note that I’ve attempted to switch and reconfigure video drivers. I could, of course, be missing something.
Device
I’ve been running NixOS on my Razer 14" laptop for about two years, and this is the first time there have been any issues I couldn’t resolve with basic changes to my Nix configuration.
OS: NixOS 24.05.20240719.0c53b6b (Uakari) x86_64
Host: Razer PI411
Kernel: 6.6.41
Uptime: 20 mins
Packages: 2276 (nix-system), 414 (nix-user)
Shell: fish 3.7.1
Resolution: 2560x1440
DE: GNOME 46.2 (Wayland)
Theme: shell-Teal-Dark [GTK2/3]
Icons: Adwaita [GTK2/3]
Terminal: tmux
CPU: AMD Ryzen 9 5900HX with Radeon Graphics (16) @ 4.890GHz
GPU: NVIDIA GeForce RTX 3080 Mobile / Max-Q 8GB/16GB
GPU: AMD ATI Radeon Vega Series / Radeon Vega Mobile Series
Memory: 3413MiB / 15395MiB
Issue
There is a range of issues and I’m not sure if they’re related or not. They all started after the upgrade to 24.05. Issues are as follows.
Intermittent freezing:
It’s very hard for me to find a pattern to when this happens. It may (or may not) be more likely when I switch between GNOME desktops. I’ve also left a terminal process running, walked away, and came back to a frozen screen.
When the computer freezes, I cannot access TTY, the mouse doesn’t move, keyboard doesn’t respond, and my only option is a forced shutdown.
I’ve combed through the journalctl logs for the time around when this happens and cannot find anything out of the ordinary. The logs just stop when the freeze happens.
Cannot play local video on VLC:
When I attempt to play videos in VLC, there is audio but no video.
I’ve played with the video codec settings and everything there looks normal.
Random sleep and occasional problems waking from sleep
My favorite issue is when my computer just decides it’s time to sleep. I think this is most common if my computer has already gone to sleep and then woken up successfully. If I’m still in the same session I started after the last boot, with no sleep in between, I don’t think the random sleeping happens.
I’ve also had the classic refusing-to-wake-from-sleep issue. Not much else to say on this one.
Inability to Shutdown or Restart
This one is the oddest to me. When I attempt a restart or a shutdown from the GNOME desktop, I get the normal terminal screen showing the kernel shutdown process, but at some point it hangs. I have not seen a pattern to where it hangs, though I’m not confident there really is none. When it hangs, my only option is a forced shutdown.
(probably not related) Orphaned pointers in storage drive
I’m pretty sure this is just a result of constantly having to do forced shut downs. At one point I couldn’t boot the system because there were issues with the file system. I had to mount a recovery drive and run fsck to repair the main partition.
Attempted Solutions
By this point you’re probably thinking NVIDIA drivers are the issue. That’s what I’ve assumed too, although the terminal screen hanging on shutdown doesn’t seem to fit my understanding of those issues. I’ve attempted to go back to bare basics with the config, and I’ve switched from the production version of the NVIDIA driver to stable and then back to production. Nothing I’ve tried seems to make a difference.
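For readers unfamiliar with how the driver branch is chosen on NixOS, switching between production and stable is usually a one-line change. The following is a minimal sketch assuming the stock `hardware.nvidia` module; the actual configuration used here is not shown in this thread, so every value below is illustrative rather than the poster's setup.

```nix
{ config, ... }:
{
  # Load the proprietary NVIDIA driver (requires unfree packages to be allowed).
  services.xserver.videoDrivers = [ "nvidia" ];

  hardware.nvidia = {
    # Kernel modesetting is generally needed for Wayland sessions.
    modesetting.enable = true;

    # Pick the driver branch from the package set that matches the running kernel.
    package = config.boot.kernelPackages.nvidiaPackages.production;
    # Alternative branch mentioned above:
    # package = config.boot.kernelPackages.nvidiaPackages.stable;
  };
}
```

Taking the package from `config.boot.kernelPackages` rather than `pkgs` keeps the driver module built against whichever kernel is configured, which matters when the kernel is pinned or changed.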
I’ve regenerated my `hardware-configuration.nix` file to make sure there aren’t version differences related to anything. No results here.
I’ve been stalking journalctl logs and have not come up with anything obvious. I don’t even know where to look anymore. I cannot see where this issue is originating from.
Absolutely. Doing that now. Good thoughts, although I’d be shocked if I’m OOMing with 16G. It may take a bit to get an event to log. The issue’s not very reproducible.
Okay. Well. I tried to get the logs from `dmesg -w` by running `dmesg -w > /Documents/logs.txt` and checking the file after a forced shutdown following a freeze. I’m not seeing anything notable. This could be because the file isn’t capturing things at the time of the freeze, though…
I have started to see a consistent output when I try to shut down or restart. I no longer have any successful shutdown or restart attempts from GNOME or the terminal. I’m not totally sure how to get this in text form, but here is a picture. I’m still very, very stumped on this one.
Those are the last few lines of a kernel panic, I’m pretty sure. Any chance your caps lock is flashing?
That’d be why you need to force it off, as well, and could very well cause file system issues.
This could be any number of things, of course. I’ve had a faulty (internal) keyboard wire cause random panics before. For now, share a full `journalctl --boot -1`, and maybe try down/upgrading the kernel.
Wonderful. I can work with that. I’ll try and pin an older kernel. Haven’t done that in NixOS yet but I’m sure I can figure something out. My caps lock is not flashing, no.
Here is a pastebin link for the journal logs on my last boot.
Also, I really really appreciate your help; lending me some of your knowledge. Thank you.
i’d still try `pkgs.linux_6_9`, or `linux_6_8` failing that. i’m sure there are people running 6.10 on nvidia, but i don’t know what would be required for that
I got 6_8 to build, no issue (yet). I’ll play around with this and see if I’m still getting freezing or random sleeps. I’m actually surprised at how much easier it is to pin a kernel in NixOS compared to Ubuntu-based distros. I suppose I shouldn’t be surprised at these things at this point.
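For anyone following along, pinning a kernel as discussed above can be sketched roughly as follows. This is an assumption about how the pin was done, not the configuration actually used in this thread. Note that `boot.kernelPackages` expects a `linuxPackages_*` package set rather than the bare `linux_*` kernel derivation, and whether a given version attribute exists depends on the nixpkgs revision in use.

```nix
{ pkgs, ... }:
{
  # Pin the 6.8 kernel series instead of the release default (6.6 on 24.05).
  # Assumes linuxPackages_6_8 is still present in the nixpkgs revision in use;
  # end-of-life kernel series are eventually removed from nixpkgs.
  boot.kernelPackages = pkgs.linuxPackages_6_8;
}
```

After `nixos-rebuild switch` and a reboot, `uname -r` should report the pinned series, and the previous generation stays available in the boot menu as a fallback.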
I want to chime in here and say that after my latest nixos-unstable bump (9f4128e00b0ae8ec65918efeba59db998750ead6 2024-07-03), I have been experiencing relatively frequent crashes as well. Sometimes these happen as I’m using my system, and other times it happens after waking up from sleep.
And yes, my Caps Lock is flashing.
I will be updating within the next few days to see if it resolves the issue. I’m using `boot.kernelPackages = pkgs.linuxPackages_latest;` so the kernel should definitely bump from 6.9.7 to something newer.
Your issue is unlikely to be related to this one, but try reverting the latest kernel bump and see if that fixes it. If it does, this is a kernel bug that was backported, and the kernel devs would likely want to know about it.
I am pretty new to NixOS; I use it mostly for running some Docker containers.
When I first set it up, it ran for 50+ days without any issues, but for the last few weeks it has been freezing every 2-3 days. Caps Lock and Num Lock still work, but nothing else does.
I have also had to force-reboot it every time this happens.
We’ll need way more context to help. Start a new thread, state your hardware, and share at minimum `journalctl --boot -1` and `dmesg` (in code blocks) after rebooting following one of these freezes. It’d be extra helpful if you could pin down a specific nixpkgs commit, which you could do by bisecting.
But please start a different thread, this one’s ancient and almost certainly unrelated.