RPi3: USB disk disconnected during Stage 2

Hello,

During early boot on my Raspberry Pi 3B+ (I guess during Stage 1), the attached USB disk is recognized by the kernel as /dev/sda:

Feb 15 20:22:23 nixos-rpi kernel: usb 1-1.1.3: new high-speed USB device number 6 using dwc2
Feb 15 20:22:23 nixos-rpi kernel: usb-storage 1-1.1.3:1.0: USB Mass Storage device detected
Feb 15 20:22:23 nixos-rpi kernel: scsi host0: usb-storage 1-1.1.3:1.0
Feb 15 20:22:23 nixos-rpi kernel: scsi 0:0:0:0: Direct-Access     TOSHIBA  External USB 3.0 5438 PQ: 0 ANSI: 6
Feb 15 20:22:23 nixos-rpi kernel: sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
Feb 15 20:22:23 nixos-rpi kernel: sd 0:0:0:0: [sda] 5860533164 512-byte logical blocks: (3.00 TB/2.73 TiB)
Feb 15 20:22:23 nixos-rpi kernel: sd 0:0:0:0: [sda] 4096-byte physical blocks
Feb 15 20:22:23 nixos-rpi kernel: sd 0:0:0:0: [sda] Write Protect is off
Feb 15 20:22:23 nixos-rpi kernel: sd 0:0:0:0: [sda] Mode Sense: 23 00 00 00
Feb 15 20:22:23 nixos-rpi kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Feb 15 20:22:23 nixos-rpi kernel:  sda: sda1 sda2 sda4
Feb 15 20:22:23 nixos-rpi kernel: sd 0:0:0:0: [sda] Attached SCSI disk

But a few seconds after entering Stage 2, the USB disk is stopped (I can hear the disk spin down with a mechanical noise):

Aug 05 18:24:10 nixos-rpi kernel: usb 1-1.1.3: USB disconnect, device number 6
Aug 05 18:24:10 nixos-rpi kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
Aug 05 18:24:10 nixos-rpi kernel: sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Aug 05 18:24:10 nixos-rpi kernel: sd 0:0:0:0: [sda] Stopping disk
Aug 05 18:24:10 nixos-rpi kernel: sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK

And the same disk reappears a few seconds later as /dev/sdb:

Aug 05 18:24:13 nixos-rpi kernel: usb 1-1.1.3: new high-speed USB device number 10 using dwc2
Aug 05 18:24:13 nixos-rpi kernel: usb-storage 1-1.1.3:1.0: USB Mass Storage device detected
Aug 05 18:24:13 nixos-rpi kernel: scsi host1: usb-storage 1-1.1.3:1.0
Aug 05 18:24:14 nixos-rpi kernel: scsi 1:0:0:0: Direct-Access     TOSHIBA  External USB 3.0 5438 PQ: 0 ANSI: 6
Aug 05 18:24:14 nixos-rpi kernel: sd 1:0:0:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
Aug 05 18:24:14 nixos-rpi kernel: sd 1:0:0:0: [sdb] 5860533164 512-byte logical blocks: (3.00 TB/2.73 TiB)
Aug 05 18:24:14 nixos-rpi kernel: sd 1:0:0:0: [sdb] 4096-byte physical blocks
Aug 05 18:24:14 nixos-rpi kernel: sd 1:0:0:0: [sdb] Write Protect is off
Aug 05 18:24:14 nixos-rpi kernel: sd 1:0:0:0: [sdb] Mode Sense: 23 00 00 00
Aug 05 18:24:14 nixos-rpi kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 05 18:24:14 nixos-rpi kernel:  sdb: sdb1 sdb2 sdb4
Aug 05 18:24:14 nixos-rpi kernel:         ok 45 3 outputs: DSI0, TXP, HDMI1
Aug 05 18:24:14 nixos-rpi kernel: sd 1:0:0:0: [sdb] Attached SCSI disk

Besides the scary mechanical noise when the disk is stopped, the start/stop/start cycle breaks the LVM volume group on the USB disk: LVM sees /dev/sdb4 as a duplicate of the PV it already registered as /dev/sda4, so I had to manually activate the VG after boot.

Aug 05 18:24:17 nixos-rpi lvm[1463]: PV /dev/sdb4 8:20 is duplicate for PVID 63ff9cpFAsq3RJqUefGTDsOJbwrWetwm on 8:4 /dev/sda4.
Aug 05 18:24:17 nixos-rpi lvm[1463]: PV /dev/sdb4 failed to create online file.

The complete boot logs: rpi3_boot_linux_latest.txt · GitHub

FYI, there is no problem with the same USB disk on ArchLinuxARM (I just swapped the SD card to test NixOS on my RPi), so I guess it’s not a USB power issue.

I’m a bit lost, and I don’t know where to look or what to test. Any help would be much appreciated.

OK, problem solved (or should I say “worked around”?).

TL;DR

boot.blacklistedKernelModules = [ "onboard_usb_hub" ];

Long version:

Honestly, I don’t know why the onboard_usb_hub module is at fault.
I checked the startup logs again and saw that all USB devices (not only the USB disk) are disconnected during boot, right after the onboard-usb-hub driver is registered.

Aug 05 18:24:10 nixos-rpi kernel: usbcore: registered new device driver onboard-usb-hub
Aug 05 18:24:10 nixos-rpi kernel: usb 1-1.1: USB disconnect, device number 3
Aug 05 18:24:10 nixos-rpi kernel: lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): Failed to write register index 0x00000010. ret = -19
Aug 05 18:24:10 nixos-rpi kernel: lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): Registers INIT FAILED....
Aug 05 18:24:10 nixos-rpi kernel: lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): Bind routine FAILED
Aug 05 18:24:10 nixos-rpi kernel: usb 1-1.1.1: USB disconnect, device number 5
Aug 05 18:24:10 nixos-rpi kernel: usbcore: registered new interface driver lan78xx
Aug 05 18:24:10 nixos-rpi kernel: usb 1-1.1.3: USB disconnect, device number 6

After a bit of searching on the Internet, I found a thread from December 2022 about a regression involving this module on the RPi3.

So I tried blacklisting this module, and… it works!
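
For anyone who wants to apply the same workaround, here is a minimal sketch of where the line from the TL;DR would go. It assumes the usual /etc/nixos/configuration.nix layout with a generated hardware-configuration.nix next to it; that import and the overall file structure are assumptions on my side, only the boot.blacklistedKernelModules line comes from my actual setup.

```nix
# /etc/nixos/configuration.nix (sketch; the surrounding structure is assumed)
{ config, pkgs, ... }:

{
  imports = [
    # Assumed: the hardware config generated by nixos-generate-config.
    ./hardware-configuration.nix
  ];

  # Keep the kernel from auto-loading onboard_usb_hub, which on this
  # RPi 3B+ coincides with all USB devices being disconnected in Stage 2.
  boot.blacklistedKernelModules = [ "onboard_usb_hub" ];
}
```

After a nixos-rebuild switch and a reboot, the module should no longer show up in lsmod, and the disk should stay on /dev/sda for the whole boot. Note that this only works around the symptom; it does not explain why the module resets the hub.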