Networkd + iwd = upgrades knock machines offline

Hi,

This has bitten me again, this time in a situation where I’m having to call in a favor for remote hands-on recovery. I’m posting here to try to get some extra attention on this. I suspect some systemd/networkd guru will know a better solution.

https://github.com/NixOS/nixpkgs/issues/195777

As you can imagine, this is very problematic for NixOS hosts that are only reachable via wifi.

Currently, my laptop is “tethered” to my home network via my phone and a USB-Ethernet connection, due to this bug and yesterday’s major update, and the fact that I can’t afford to restart sway or my machine right now.

AFAIU both networkd and iwd claim to rename and configure wifi devices. That usually confuses one of them, which then fails to set up the interface that disappeared from under it.

I personally disabled wifi interface renaming in iwd and rely on networkd for it, as follows:

  networking.wireless.iwd = {
    enable = true;
    settings = {
      General = {
        # systemd-networkd renames the interface for us
        UseDefaultInterface = true;
      };
    };
  };

  # systemd-networkd and its config
  systemd.network.enable = true;
  systemd.network.links."10-wl0" = {
    matchConfig.MACAddress = "aa:bb:cc:dd:ee:ff";
    linkConfig.Name = "wl0";
  };

Have you tried the fix in the linked ticket? Usually sudo systemctl restart systemd-udev-trigger fixes this for me without having to restart anything else. It obviously requires local access.

Hey @polygon, I’m probably the one that suggested that workaround :laughing: .

Unfortunately, for the past few months that workaround has triggered (I think) some sort of GPU change that causes Sway to hang. So I either have to restart my computer or, at the very least, Sway (which are usually equally annoying).

@trofi those words sound relevant. I’m going to give your suggestion a shot.

Separately, if anyone knows a way to manually trigger this to test workarounds, I’m all ears. Otherwise I have to make this change with a comment to “revisit if this doesn’t break for 6 months”, or whatnot.

:pray: thanks all.

What @trofi said makes sense to me.

I do the opposite: let iwd rename the interfaces and have networkd configure them based on standard naming. E.g., by default the kernel names my wireless interface something weird, iwd renames it to wlan0, then networkd configures wlan0.
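
Roughly, that arrangement could look like the sketch below. This is a minimal, untested-on-your-setup example: it assumes the interface really ends up as wlan0 and that plain DHCP is what you want, and the unit name "25-wireless" is just an arbitrary name picked for illustration.

```nix
  # iwd keeps its default behaviour and handles (and names) the wifi interface
  networking.wireless.iwd.enable = true;

  systemd.network.enable = true;
  # "25-wireless" is an arbitrary unit name chosen for this example
  systemd.network.networks."25-wireless" = {
    # assumes the interface is called wlan0 once iwd is done with it
    matchConfig.Name = "wlan0";
    # assumes DHCP is wanted on this link
    networkConfig.DHCP = "yes";
  };
```

Note there is no systemd.network.links entry here, so networkd is not asked to rename anything; it only configures whatever shows up as wlan0.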

If this naming is the culprit, it certainly feels like there’s at the very least some sort of ordering that differs between boot and a switched generation. Otherwise I’d expect my system to get similarly jammed up on boot, no? I don’t really know the stack well enough to understand how udev, systemd, networkd, and iwd all take their turns at the interface (since I think technically some of this renaming is done by udev).

Hey folks,

To restate the issue: I took another large channel upgrade today and lost management of my wireless interface again.

Having hit this again, I’m feeling somewhat stubborn about sorting it out once and for all.

Right now I’m USB-tethered to my phone, so I’m “unblocked”, and I don’t need to be mobile for some time, so I can coast in this state while I investigate.

I’d be incredibly grateful for any suggestions - even just what to do to try to determine if there’s a race or rename issue going on.

I’ve also encountered this issue, and have noticed it also knocks out my Bluetooth (my Intel WiFi device is also my Bluetooth device) which leads me to suspect it’s drivers/udev/something else suitably low-level.