Random lockups/freezes throughout the day during non-intensive tasks

Over the past week my Framework 13 laptop running NixOS has been freezing periodically (say ~2 times a day with consistent use throughout the day). The screen freezes on the current frame and none of the controls work. My caps lock key blinks, which I learned may indicate kernel panics. I need to hold down the power button for several seconds to reboot.

The lockups generally occur when I’m doing fairly lightweight tasks (browsing the internet in Firefox), with the fans not running. I’ve done a few compiles with the fans at full blast and had no lockups. I’m now on NixOS 23.11, but this reproduced earlier in the week before I upgraded. I’m not positive which generation this started in, and it’s hard to test each one since the issue reproduces infrequently.

I dual boot Windows and ran the memory corruption tool; nothing was reported. I’ll likely do a longer mem test tomorrow. I have not observed this behavior in Windows yet, but admittedly I have not spent much time in that OS, just a few hours to see if the issue would hit (doing the same web browsing tasks).

Before the latest freeze I added boot.crashDump.enable = true; to my config, and I was able to see this error at the end of the output of sudo journalctl -k --boot=<last_boot>:

Dec 27 18:07:08 obsidian kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 21510 jiffies s: 4349 root: 0x1/.
Dec 27 18:07:08 obsidian kernel: rcu: blocking rcu_node structures (internal RCU debug):
Dec 27 18:07:08 obsidian kernel: Sending NMI from CPU 4 to CPUs 0:
Dec 27 18:07:08 obsidian kernel: NMI backtrace for cpu 0
Dec 27 18:07:08 obsidian kernel: CPU: 0 PID: 1131 Comm: irq/184-iwlwifi Kdump: loaded Tainted: G     U     O       6.1.69 #1-NixOS
Dec 27 18:07:08 obsidian kernel: Hardware name: Framework Laptop (13th Gen Intel Core)/FRANMCCP06, BIOS 03.04 05/24/2023
Dec 27 18:07:08 obsidian kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x24f/0x2a0
Dec 27 18:07:08 obsidian kernel: Code: c1 ea 12 83 e0 03 83 ea 01 48 c1 e0 05 48 63 d2 48 05 80 27 03 00 48 03 04 d5 00 3b 75 8c 48 89 28 8b 45 08 85 c0 75 09 f3 90 <8b> 45 08 85 c0 74 f7 48 8b 55 00 48 85 d2 74 8f 0f 0d >
Dec 27 18:07:08 obsidian kernel: RSP: 0000:ffffafb240003c20 EFLAGS: 00000246
Dec 27 18:07:08 obsidian kernel: RAX: 0000000000000000 RBX: ffffa371118554a0 RCX: 0000000000040000
Dec 27 18:07:08 obsidian kernel: RDX: 00000000000000ab RSI: 0000000002b190c8 RDI: ffffa371118554a0
Dec 27 18:07:08 obsidian kernel: RBP: ffffa3749f832780 R08: 0000000000000000 R09: ffffffffc1790230
Dec 27 18:07:08 obsidian kernel: R10: ffffafb240003da8 R11: 0000000000000000 R12: 0000000000000000
Dec 27 18:07:08 obsidian kernel: R13: 0000000000000000 R14: ffffa37111855480 R15: ffffa37119898028
Dec 27 18:07:08 obsidian kernel: FS:  0000000000000000(0000) GS:ffffa3749f800000(0000) knlGS:0000000000000000
Dec 27 18:07:08 obsidian kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 27 18:07:08 obsidian kernel: CR2: 00007f4787826000 CR3: 00000003f8e10000 CR4: 0000000000750ef0
Dec 27 18:07:08 obsidian kernel: PKRU: 55555554
Dec 27 18:07:08 obsidian kernel: Call Trace:
Dec 27 18:07:08 obsidian kernel:  <NMI>
Dec 27 18:07:08 obsidian kernel:  ? nmi_cpu_backtrace.cold+0x1c/0x79
Dec 27 18:07:08 obsidian kernel:  ? nmi_cpu_backtrace_handler+0xd/0x20
Dec 27 18:07:08 obsidian kernel:  ? nmi_handle+0x5a/0x120
Dec 27 18:07:08 obsidian kernel:  ? default_do_nmi+0x40/0x130
Dec 27 18:07:08 obsidian kernel:  ? exc_nmi+0x132/0x170
Dec 27 18:07:08 obsidian kernel:  ? end_repeat_nmi+0x16/0x67
Dec 27 18:07:08 obsidian kernel:  ? iwl_txq_progress+0x50/0x50 [iwlwifi]
Dec 27 18:07:08 obsidian kernel:  ? native_queued_spin_lock_slowpath+0x24f/0x2a0
Dec 27 18:07:08 obsidian kernel:  ? native_queued_spin_lock_slowpath+0x24f/0x2a0
Dec 27 18:07:08 obsidian kernel:  ? native_queued_spin_lock_slowpath+0x24f/0x2a0
Dec 27 18:07:08 obsidian kernel:  </NMI>
Dec 27 18:07:08 obsidian kernel:  <IRQ>
Dec 27 18:07:08 obsidian kernel:  _raw_spin_lock_bh+0x29/0x30
Dec 27 18:07:08 obsidian kernel:  iwl_txq_reclaim+0x83/0x5d0 [iwlwifi]
Dec 27 18:07:08 obsidian kernel:  ? iwl_mvm_tx_reclaim+0x374/0x4f0 [iwlmvm]
Dec 27 18:07:08 obsidian kernel:  ? iwl_mvm_rx_tx_cmd+0x21c/0x930 [iwlmvm]
Dec 27 18:07:08 obsidian kernel:  iwl_mvm_rx_tx_cmd+0x21c/0x930 [iwlmvm]
Dec 27 18:07:08 obsidian kernel:  ? iwl_mvm_rx_ba_notif+0x338/0x360 [iwlmvm]
Dec 27 18:07:08 obsidian kernel:  ? iwl_mvm_rx_common+0x10f/0x300 [iwlmvm]
Dec 27 18:07:08 obsidian kernel:  iwl_mvm_rx_common+0x10f/0x300 [iwlmvm]
Dec 27 18:07:08 obsidian kernel:  iwl_pcie_rx_handle+0x3ce/0xaa0 [iwlwifi]
Dec 27 18:07:08 obsidian kernel:  iwl_pcie_napi_poll_msix+0x2a/0xc0 [iwlwifi]
Dec 27 18:07:08 obsidian kernel:  __napi_poll+0x28/0x160
Dec 27 18:07:08 obsidian kernel:  net_rx_action+0x29e/0x350
Dec 27 18:07:08 obsidian kernel:  __do_softirq+0xc3/0x2ab
Dec 27 18:07:08 obsidian kernel:  ? disable_irq_nosync+0x10/0x10
Dec 27 18:07:08 obsidian kernel:  do_softirq.part.0+0x5f/0x80
Dec 27 18:07:08 obsidian kernel:  </IRQ>
Dec 27 18:07:08 obsidian kernel:  <TASK>
Dec 27 18:07:08 obsidian kernel:  __local_bh_enable_ip+0x64/0x70
Dec 27 18:07:08 obsidian kernel:  iwl_pcie_irq_rx_msix_handler+0xc5/0x180 [iwlwifi]
Dec 27 18:07:08 obsidian kernel:  irq_thread_fn+0x1c/0x60
Dec 27 18:07:08 obsidian kernel:  irq_thread+0xf7/0x1c0
Dec 27 18:07:08 obsidian kernel:  ? irq_thread_fn+0x60/0x60
Dec 27 18:07:08 obsidian kernel:  ? irq_thread_check_affinity+0xc0/0xc0
Dec 27 18:07:08 obsidian kernel:  kthread+0xd7/0x100
Dec 27 18:07:08 obsidian kernel:  ? kthread_complete_and_exit+0x20/0x20
Dec 27 18:07:08 obsidian kernel:  ret_from_fork+0x1f/0x30
Dec 27 18:07:08 obsidian kernel:  </TASK>

Full kernel logs from last boot: panic_4_kernel.log · GitHub

Here is my config: https://github.com/joshspicer/nixos-config/blob/180bd6848c032e27b0040724068e764a4aeb6316/hosts/obsidian/configuration.nix
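
For reference, the crash-dump setting mentioned above amounts to roughly the following. This is a minimal sketch: only boot.crashDump.enable comes from my setup, and the surrounding module skeleton is illustrative rather than copied from the linked file.

```nix
{ ... }:
{
  # Sets up a crash (kdump) kernel; once active, kernel traces carry
  # the "Kdump: loaded" marker seen in the log above.
  boot.crashDump.enable = true;
}
```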

Any idea how to proceed? Happy to run tests, etc to help resolve this issue. Thanks!

Some more system info:

lscpu

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  16
  On-line CPU(s) list:   0-15
Vendor ID:               GenuineIntel
  Model name:            13th Gen Intel(R) Core(TM) i7-1360P
    CPU family:          6
    Model:               186
    Thread(s) per core:  2
    Core(s) per socket:  12
    Socket(s):           1
    Stepping:            2
    CPU(s) scaling MHz:  13%
    CPU max MHz:         5000.0000
    CPU min MHz:         400.0000
    BogoMIPS:            5222.40
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi umip pku ospke waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize arch_lbr ibt flush_l1d arch_capabilities
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   448 KiB (12 instances)
  L1i:                   640 KiB (12 instances)
  L2:                    9 MiB (6 instances)
  L3:                    18 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-15
Vulnerabilities:         
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                 Not affected
  Tsx async abort:       Not affected

sudo dmidecode -t bios

# dmidecode 3.5
Getting SMBIOS data from sysfs.
SMBIOS 3.4 present.

Handle 0x0000, DMI type 0, 26 bytes
BIOS Information
        Vendor: INSYDE Corp.
        Version: 03.04
        Release Date: 05/24/2023
        Address: 0xE0000
        Runtime Size: 128 kB
        ROM Size: 16 MB
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                8042 keyboard services are supported (int 9h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
                UEFI is supported
        BIOS Revision: 3.4

Handle 0x0015, DMI type 13, 22 bytes
BIOS Language Information
        Language Description Format: Long
        Installable Languages: 4
                en|US|iso8859-1,0
                fr|FR|iso8859-1,0
                zh|TW|unicode,0
                ja|JP|unicode,0
        Currently Installed Language: en|US|iso8859-1,0

cat /proc/version

Linux version 6.1.69 (nixbld@localhost) (gcc (GCC) 12.3.0, GNU ld (GNU Binutils) 2.40) #1-NixOS SMP PREEMPT_DYNAMIC Wed Dec 20 16:00:29 UTC 2023

I was hitting this problem the other day. I’m pretty sure a kernel bug was the cause in my case, so I downgraded and things seem better again (knock on wood).

A couple of things for you to try:

  • Try a different kernel, like latest (6.7.0):
   # Latest Kernel Packages
   boot.kernelPackages = pkgs.linuxPackages_latest;
  • Upgrade to the new stable release, which is now 23.11 (change the system version from 23.05 and use nix-channel).
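
If the latest kernel misbehaves too, pinning a specific series is another way to test a different kernel. A minimal sketch, assuming your channel ships a linuxPackages_6_6 attribute (attribute names vary between nixpkgs releases, so check what yours actually provides):

```nix
{ pkgs, ... }:
{
  # Assumption: linuxPackages_6_6 exists in your nixpkgs channel;
  # swap in whichever kernel series you want to test against.
  boot.kernelPackages = pkgs.linuxPackages_6_6;
}
```

After a nixos-rebuild and reboot, cat /proc/version should show the new kernel version.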