Wifi failing since recent NixOS update, looking for debug or bisect help

I am using a flake to manage my NixOS system. The code is here and it contains the autoupgrade setting to fetch a new version of nixpkgs (nixos-unstable branch):

{
  system.autoUpgrade = {
    enable = true;
    flags = [
      "--commit-lock-file"
    ] ++ pkgs.lib.lists.concatMap (i: ["--update-input" i]) (builtins.attrNames self.inputs);
    flake = "/home/luc/src/sys";
  };
}

Currently I roll back every update and try to stay on the nixpkgs commit ff377a78794d412a35245e05428c8f95fef3951f, because at some point after this commit my wifi breaks.

Now I have two questions:

1. How to debug the failing wifi?

My logs say that my wifi card is Broadcom BCM43b1. When I do sudo journalctl -b | grep -i bcm I get these lines on a good generation:

Feb 11 11:14:25 yoga kernel: usb 1-4: Product: BCM20702A0
Feb 11 11:14:25 yoga kernel: eth0: Broadcom BCM43b1 802.11 Hybrid Wireless Controller 6.30.223.271 (r587334)
Feb 11 11:14:26 yoga kernel: Bluetooth: hci0: BCM: chip id 63
Feb 11 11:14:26 yoga kernel: Bluetooth: hci0: BCM: features 0x07
Feb 11 11:14:26 yoga kernel: Bluetooth: hci0: BCM20702A
Feb 11 11:14:26 yoga kernel: Bluetooth: hci0: BCM20702A1 (001.002.014) build 0000
Feb 11 11:14:26 yoga kernel: Bluetooth: hci0: BCM: firmware Patch file not found, tried:
Feb 11 11:14:26 yoga kernel: Bluetooth: hci0: BCM: 'brcm/BCM20702A1-0489-e07a.hcd'
Feb 11 11:14:26 yoga kernel: Bluetooth: hci0: BCM: 'brcm/BCM-0489-e07a.hcd'
Feb 11 11:14:27 yoga kernel: Modules linked in: cmdlinepart intel_spi_platform intel_spi spi_nor ip6_tables joydev hid_sensor_magn_3d hid_sensor_gyro_3d hid_sensor_incl_3d xt_conntrack nf_conntrack msr hid_sensor_accel_3d nf_defrag_ipv6 hid_sensor_rotation 8021q nf_defrag_ipv4 hid_sensor_als mtd hid_sensor_trigger iTCO_wdt industrialio_triggered_buffer intel_pmc_bxt kfifo_buf hid_multitouch watchdog hid_sensor_iio_common hid_rmi mei_hdcp industrialio ip6t_rpfilter rmi_core ipt_rpfilter hid_sensor_custom i915 intel_rapl_msr xt_pkttype nf_log_ipv6 nf_log_ipv4 wmi_bmof nf_log_common xt_LOG xt_tcpudp nft_compat nft_counter snd_hda_codec_realtek sunrpc snd_hda_codec_generic ledtrig_audio snd_hda_intel x86_pkg_temp_thermal intel_powerclamp snd_intel_dspcfg uvcvideo coretemp nls_iso8859_1 crc32_pclmul snd_hda_codec btusb nls_cp437 ghash_clmulni_intel nf_tables btrtl vfat btbcm rapl btintel fat cec videobuf2_vmalloc videobuf2_memops nfnetlink intel_cstate videobuf2_v4l2 sch_fq_codel videobuf2_common deflate
Feb 11 11:14:27 yoga kernel:  drm_kms_helper snd_hda_core bluetooth intel_uncore i2c_i801 videodev snd_hwdep serio_raw intel_gtt efi_pstore mei_me intel_pch_thermal i2c_smbus bcma snd_pcm mc hid_sensor_hub evdev processor_thermal_device ecdh_generic snd_timer mousedev ecc crc16 mac_hid i2c_algo_bit intel_rapl_common lpc_ich fb_sys_fops syscopyarea i2c_hid sysfillrect int340x_thermal_zone mei loop snd sysimgblt intel_soc_dts_iosf cpufreq_powersave thermal fan soundcore ideapad_laptop tun tap sparse_keymap tiny_power_button i2c_designware_platform macvlan wmi bridge battery soc_button_array pinctrl_lynxpoint stp int3400_thermal llc dw_dmac i2c_designware_core acpi_thermal_rel dw_dmac_core button acpi_pad video ac wl(PO) cfg80211 rfkill kvm_intel kvm drm irqbypass agpgart pstore backlight fuse i2c_core configfs efivarfs ip_tables x_tables autofs4 xfs libcrc32c crc32c_generic dm_crypt cbc encrypted_keys trusted tpm rng_core hid_generic usbhid hid sd_mod t10_pi crc_t10dif crct10dif_generic input_leds led_class
Feb 11 11:14:27 yoga systemd[1]: Found device BCM4352 802.11ac Wireless Network Adapter.

and these on a bad generation, where I do not have wifi connection:

Feb 11 10:59:04 yoga kernel: usb 1-4: Product: BCM20702A0
Feb 11 10:59:04 yoga kernel: eth0: Broadcom BCM43b1 802.11 Hybrid Wireless Controller 6.30.223.271 (r587334)
Feb 11 10:59:04 yoga kernel: Bluetooth: hci0: BCM: chip id 63
Feb 11 10:59:04 yoga kernel: Bluetooth: hci0: BCM: features 0x07
Feb 11 10:59:04 yoga kernel: Bluetooth: hci0: BCM20702A
Feb 11 10:59:04 yoga kernel: Bluetooth: hci0: BCM20702A1 (001.002.014) build 0000
Feb 11 10:59:04 yoga kernel: Bluetooth: hci0: BCM: firmware Patch file not found, tried:
Feb 11 10:59:04 yoga kernel: Bluetooth: hci0: BCM: 'brcm/BCM20702A1-0489-e07a.hcd'
Feb 11 10:59:04 yoga kernel: Bluetooth: hci0: BCM: 'brcm/BCM-0489-e07a.hcd'
Feb 11 10:59:04 yoga systemd[1]: Found device BCM4352 802.11ac Wireless Network Adapter.
Feb 11 10:59:07 yoga kernel:  crc32_pclmul wmi_bmof nfnetlink ghash_clmulni_intel sch_fq_codel nls_iso8859_1 videobuf2_common btbcm nls_cp437 btintel vfat fat snd_compress evdev rapl ac97_bus deflate snd_pcm_dmaengine snd_pcm intel_cstate mac_hid bluetooth drm_kms_helper snd_timer mei_me efi_pstore videodev intel_pch_thermal intel_uncore loop cpufreq_powersave ecdh_generic tun serio_raw ecc crc16 tap intel_gtt macvlan processor_thermal_device i2c_i801 bcma ideapad_laptop intel_rapl_common i2c_algo_bit fb_sys_fops syscopyarea int340x_thermal_zone hid_sensor_hub mc snd bridge i2c_hid stp llc lpc_ich soundcore i2c_smbus intel_soc_dts_iosf sysfillrect sparse_keymap sysimgblt mei thermal wl(PO) wmi fan battery i2c_designware_platform soc_button_array dw_dmac i2c_designware_core dw_dmac_core pinctrl_lynxpoint video cfg80211 tiny_power_button int3400_thermal acpi_pad ac acpi_thermal_rel button rfkill kvm drm irqbypass agpgart fuse pstore backlight i2c_core configfs efivarfs ip_tables x_tables autofs4 xfs

The main difference between these seems to be the “Modules linked in” lines. If I extract the modules from the logs, the diff from good (left) to bad (right) is:

--- mod-good	2022-02-11 11:32:23.997645101 +0100
+++ mod-bad	2022-02-11 11:32:30.435807322 +0100
@@ -6 +5,0 @@
-8021q
@@ -7,0 +7 @@
+ac97_bus
@@ -58 +57,0 @@
-fat
@@ -130 +128,0 @@
-msr
@@ -142,2 +139,0 @@
-nls_cp437
-nls_iso8859_1
@@ -157,0 +154 @@
+snd_compress
@@ -159,0 +157 @@
+snd_hda_codec_hdmi
@@ -165,0 +164,2 @@
+snd_pcm_dmaengine
+snd_soc_core
@@ -168,0 +169,4 @@
+soundwire_bus
+soundwire_cadence
+soundwire_generic_allocation
+soundwire_intel
@@ -172 +175,0 @@
-sunrpc
@@ -187 +189,0 @@
-vfat
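
For anyone wanting to reproduce a module comparison like the one above, here is a minimal sketch of one way to do it. The function name modules_from_log is made up for illustration, and it assumes its input contains only the "Modules linked in" line and its continuation line(s), formatted like the journal output shown earlier.

```shell
# Turn kernel "Modules linked in:" journal lines into one module name per line.
# Assumption: stdin holds only the module-list line(s) of a single boot.
modules_from_log() {
  # Strip the "<date> <host> kernel:" prefix and the "Modules linked in:" label,
  # split on spaces, drop empty lines, then sort so two lists can be diffed.
  sed -e 's/^.* kernel: *//' -e 's/^Modules linked in: *//' \
    | tr ' ' '\n' \
    | grep -v '^$' \
    | LC_ALL=C sort -u
}
```

Assuming the relevant journal lines of each generation were saved to files (hypothetical names good.log and bad.log), modules_from_log < good.log > mod-good and the same for bad.log produce two lists that diff -U0 mod-good mod-bad can compare.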

And additionally I see this message repeated every second in the journal of my wpa_supplicant service:

Feb 07 17:10:05 yoga wpa_supplicant[70319]: wlp1s0: CTRL-EVENT-SCAN-FAILED ret=-22 retry=1

My network config looks like this:

boot.extraModulePackages = [ config.boot.kernelPackages.broadcom_sta ];
networking.wireless.enable = true;
networking.wireless.interfaces = [ "wlp1s0" ];
networking.resolvconf.dnsExtensionMechanism = false;  # because my home router is stupid
networking.useDHCP = false;
networking.interfaces.wlp1s0.useDHCP = true;

2. How to efficiently bisect nixpkgs?

I was thinking about bisecting nixpkgs to find the change that broke my setup (by rebuilding my system against different versions of nixpkgs).

I have a local clone of the nixpkgs git repository where I can git bisect in order to select commits, and then I can do nixos-rebuild build --override-input nixpkgs nixpkgs/$COMMIT_HASH. The problem with this is that not all commits are built by Hydra and available on cache.nixos.org. So depending on the commit I select, I might have to build a lot of stuff (including the Linux kernel), and that takes too long for me to bear.

How can I bisect nixpkgs and only select commits that were built by Hydra?

For the last interval, where no more Hydra builds are available, I can then hopefully review the commits manually or build some of them locally.

EDIT TLDR:

https://channels.nix.gsc.io/
https://channels.nix.gsc.io/nixos-unstable/history

Just a quick side note: do you have to run unstable and auto-upgrade so aggressively? If you want a rolling release, though, breakage like this is something you have to bear.

As Nix is one of the only systems that allows you to mix stable and unstable packages, it’s good practice to run your system on stable and bring in just a few things from unstable.

That avoids breakage like this… However, I appreciate your intrepid testing of the tip-of-the-spear unstable! :slight_smile: Looks like you’re having ‘fun’.

I’m not sure how to get all the nixpkgs commits that were successfully cached by Hydra, but in your case a remote builder, either from https://nixbuild.net/ or one you build or rent yourself, could keep your Yoga from melting internally and shortening its life span.

We use an AMD 5950X here; as of writing (Feb 2022) it is the best bang for your buck in single-threaded performance and number of cores for concurrent builds. Some motherboards also accept ECC RAM for extra peace of mind. I can post the complete build specs if you’re interested.

I will research extracting the commits from the channel, so you can bisect just the commits that Hydra built and pushed to the channels.

A number of NixOS tests have to succeed for this to happen, and currently there is no notion of hardware tests… It could be done, however!!! :slight_smile:

EDIT: I knew I had seen it.

https://channels.nix.gsc.io/
https://channels.nix.gsc.io/nixos-unstable/history
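
A minimal sketch of how that channel history could drive a manual bisect, so that only commits the channel actually advanced to get tested. It assumes the history page has been saved to a local file with one entry per line, the commit hash in the first whitespace-separated field, and oldest entries first; I have not verified that format, so check it before relying on this. The function name channel_midpoint is hypothetical.

```shell
# Print the channel commit halfway between a known-good and a known-bad commit.
# Arguments: <history file> <good hash> <bad hash>
# Assumption: one entry per line, hash in the first field, oldest entry first.
channel_midpoint() {
  awk -v good="$2" -v bad="$3" '
    { hash[NR] = $1 }                 # remember every hash by line number
    $1 == good { g = NR }             # line of the known-good commit
    $1 == bad  { b = NR }             # line of the known-bad commit
    END {
      # Fail if either commit is missing or no channel commit lies between them.
      if (!g || !b || b - g < 2) exit 1
      print hash[int((g + b) / 2)]
    }
  ' "$1"
}
```

The printed hash would then go into the nixos-rebuild build --override-input nixpkgs nixpkgs/$COMMIT_HASH command from the question. Depending on whether the resulting generation has working wifi, that hash becomes the new good or bad endpoint for the next round.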


Thank you @nixinator that list of commits already helps a lot. I reduced my system closure and was able to narrow it down to these two commits:

  • d5dae6569ea9952f1ae4e727946d93a71c507821 works
  • 689b76bcf36055afdeb2e9852f5ecdd2bf483f87 does not work

Sadly, git says that’s still 1500 commits and 1000 changed files. So I will have to do some more building, unless somebody else has a hunch about what is going on.

Can you explain this a little more? I would love to have some test for this especially for bisecting automatically.

It’s a bit of ‘imagineering’, so it doesn’t exist (yet), and it doesn’t help with your immediate problem… but

Hardware testing would need a hardware test bed, and that means real, live hardware, not VMs (which you can do a lot with); there is nothing like the real thing.

@lucc: How did you fix this issue? I am having the same problem: upgrading from NixOS 21.11 to any later NixOS version breaks wifi with the same symptoms as yours.

(It does not seem to be a kernel problem: it fails with the same 5.10 kernel when used on newer NixOS versions. There is also some permission problem: iwlist scan works as root but not as an ordinary user on newer NixOS systems.)


The answer is quite sad: I bought a new wifi chip :frowning:

(Intel Dual Band-AC7260 Model7260NGW for the record, works fine so far)

@tomberek, we really need to build a ‘real’ hardware test bed, so we can make certified NixOS hardware.