Netboot into NFS root instead of the fat netboot ramdisk

Hi. I’m experimenting with the UEFI firmware for the Raspberry Pi 4 booting the mainline 5.8 kernel. Ideally I’d like to netboot it. But I don’t want to use the existing netboot infrastructure exactly as it is, since the initrd it builds is quite large, and I’d prefer to use NFS to serve a read-only root (and then maybe another NFS mount layered on top of /var/lib for persistent modifications).

I have a general idea of how this should look, but I am thinking someone has this sitting around somewhere in one of their Nix repos and could give me a jumpstart…

Also, I’m open to other suggestions that involve a netboot without a fat initrd. Thanks!

An update, and call for help:

  1. I have a Raspberry Pi 4 (rpifour1) booting u-boot and then a mainline kernel.
  2. I have another RPi4 (rpifour2) that netboots a mainline kernel directly from the first one (served via atftpd).
  3. I can toggle the power to rpifour2 via a script and the Home Assistant API, so I can iterate easily.
  4. rpifour1 monitors the serial output from rpifour2, so I can see the kernel boot log without HDMI, copy/paste it, etc.

When the second one boots, the kernel and initrd are loaded, the kernel starts to boot, and then it fails to mount root and I get this error:

# snipped
[    2.634172] Key type .fscrypt registered
[    2.638205] Key type fscrypt-provisioning registered
[    2.648414] fe201000.serial: ttyAMA0 at MMIO 0xfe201000 (irq = 24, base_baud = 0) is a PL011 rev2
[    2.657692] serial serial0: tty port ttyAMA0 registered
[    2.679032] raspberrypi-firmware soc:firmware: Attached to firmware from 2020-11-30T22:12:08
[    2.760065] dwc2 fe980000.usb: supply vusb_d not found, using dummy regulator
[    2.767544] dwc2 fe980000.usb: supply vusb_a not found, using dummy regulator
[    2.875860] dwc2 fe980000.usb: EPs: 8, dedicated fifos, 4080 entries in SPRAM
[    2.898010] sdhci-iproc fe300000.sdhci: allocated mmc-pwrseq
[    2.957512] mmc0: SDHCI controller on fe300000.sdhci [fe300000.sdhci] using PIO
[    2.973255] ALSA device list:
[    2.976417]   No soundcards found.
# snipped noisy mmc controller fail
[    3.096075] VFS: Cannot open root device "nfs" or unknown-block(0,255): error -6
[    3.103706] Please append a correct "root=" boot option; here are the available partitions:
# snipped list of memory partitions
[    3.212496] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,255)
[    3.221086] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.15 #1-NixOS
[    3.227723] Hardware name: Raspberry Pi 4 Model B Rev 1.2 (DT)
[    3.233652] Call trace:
[    3.236154]  dump_backtrace+0x0/0x1d8
[    3.239881]  show_stack+0x20/0x68
[    3.243255]  dump_stack+0xd0/0x12c
[    3.246714]  panic+0x164/0x364
[    3.249823]  mount_block_root+0x2b0/0x344
[    3.253900]  mount_root+0x78/0x88
[    3.257270]  prepare_namespace+0x138/0x178
[    3.261435]  kernel_init_freeable+0x270/0x2c4
[    3.265867]  kernel_init+0x1c/0x128
[    3.269416]  ret_from_fork+0x10/0x34
[    3.273055] SMP: stopping secondary CPUs
[    3.277054] Kernel Offset: 0x4f96b9000000 from 0xffff800010000000
[    3.283248] PHYS_OFFSET: 0xffffaa4fc0000000
[    3.287500] CPU features: 0x0240022,61806000
[    3.291840] Memory Limit: none
[    3.294954] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,255) ]---

I have found old mailing list threads that hint that maybe we need to go out of our way to provide an nfsmount:

At the same time though, I’m very suspicious that I don’t see anything about genet in the kernel init log, despite forcing it into the initrd via boot.initrd.kernelModules. On rpifour1, which boots from USB-NVMe, the log clearly shows genet and then eth0 appearing.

Relevant config for rpifour2:

{
  imports = [
      "${modulesPath}/installer/netboot/netboot.nix"
  ];
  config = {
      boot.kernelPackages = pkgs.linuxPackages_latest;
      boot.initrd.supportedFilesystems = lib.mkForce [ "vfat" "nfs" ];
      boot.initrd.kernelModules = [
        "nfs" "genet" "broadcom"
        "xhci_pci" "libphy" "bcm_phy_lib"
      ];

      boot.kernelModules = config.boot.initrd.kernelModules;
      networking.hostName = "rpifour2";
      networking.useDHCP = true;

      boot.initrd.network.enable = true;
  };
}

The resulting cmdline.txt that is provided to the pi during netboot winds up looking like this:

earlycon=uart8250,mmio32,0xfe215040 ip=dhcp root=/dev/nfs nfsroot=192.168.1.2:/rpifour2,vers=4.1,proto=tcp ro rootwait elevator=deadline init=/nix/store/nqnz45qm8p9649w5hcg8pzma0vm223bg-nixos-system-rpifour2-21.03pre-git-ab8ca6d6/init isolcpus=3 nfsrootdebug

(Note that the config above sets boot.initrd.network.enable and forces genet into the initrd.)



I feel like someone out there has a nixos config with nfsroot somewhere. I can pick up the pieces if you throw it up in a gist! I think it would be really cool for this scenario to work out of the box for NixOS. If anyone is interested in working on this, let me know (I’m colemickens on freenode and @colemickens:matrix.org on Matrix).

(Maybe this is also something that could/would be addressed in the course of converting more of our init to using systemd?)

I think this actually comes down to busybox’s mount only supporting NFSv3.

And I can’t manage to mount my NFS share via nfsvers=3…?

    services.nfs.server = {
      enable = true;
      exports = ''
        /export             192.168.1.0/24(fsid=0,ro)
        /export/rpifour2    192.168.1.0/24(ro,nohide,no_root_squash,insecure,no_subtree_check)
      '';
    };

❯ sudo mount.nfs -o vers=4 192.168.1.2:/rpifour2 /tmp/test

~
❯ ls /tmp/test
nix  root-rpifour2-test

~
❯ sudo umount /tmp/test                                   

~
❯ sudo mount.nfs -o vers=3 192.168.1.2:/rpifour2 /tmp/test
Failed to start rpc-statd.service: Unit rpc-statd.service not found.
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.

The NFS docs make it seem like v3 should be enabled by default.
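
One possible reading of the error above is that it comes from the client side rather than the server: the machine running the test mount has no rpc-statd unit at all. A minimal sketch, assuming the client is also a NixOS machine and that listing "nfs" in boot.supportedFilesystems is what pulls in the NFS client units there (I have not verified this against this exact nixpkgs revision):

```nix
{
  # Assumption: this goes into the configuration of the machine that runs
  # the test mount (the NFS client), not the NFS server.
  # The expectation is that this enables rpcbind and provides
  # rpc-statd.service, which mount.nfs wants for NFSv3 locking.
  boot.supportedFilesystems = [ "nfs" ];
}
```

Alternatively, the error message itself suggests passing `-o nolock`, which avoids the need for statd. Also note that the fsid=0 pseudo-root is an NFSv4 concept, so a v3 mount most likely needs the full export path (192.168.1.2:/export/rpifour2) rather than 192.168.1.2:/rpifour2.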

Update: there are a couple of routes (at least) to go down:

  1. Pure nfsroot, skip initrd, boot kernel with nfsroot=... set
    • this requires a custom kernel config for rpi4 to include GENET builtin
  2. Leverage nixos stage-1
    • just put needed network, etc drivers into initramfs
    • let stage-1 setup network
    • let stage-1 mount with busybox

I chose option 2, since I don’t have experience with modifying existing kernel configs. For completeness I might circle back around and see about accomplishing the first approach, as well.
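
For reference, option 1 could look roughly like the following. This is an untested sketch, assuming boot.kernelPatches with a null patch and an extraConfig string is an acceptable way to adjust the kernel config, and that the listed symbols are the relevant ones for GENET plus kernel-level NFS root. The patch name is made up, and nixpkgs' common kernel config may conflict with some of these settings.

```nix
{
  boot.kernelPatches = [
    {
      # Hypothetical name, only used for identification.
      name = "rpi4-builtin-nfsroot";
      patch = null;
      # Assumption: these symbols are enough for the kernel to bring up
      # eth0 and mount an NFS root without any initrd.
      extraConfig = ''
        BCMGENET y
        BROADCOM_PHY y
        IP_PNP y
        IP_PNP_DHCP y
        NFS_FS y
        ROOT_NFS y
      '';
    }
  ];
}
```

This forces a full kernel rebuild, which is one more reason option 2 is the easier place to start.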

On option 2, I’m getting close:

loading module dm_mod...
[    3.753016] device-mapper: ioctl: 4.43.0-ioctl (2020-10-01) initialised: dm-devel@redhat.com
loading module af_packet...
running udev...
Starting version 247
bringing up network interface eth0...
[    4.148318] bcmgenet fd580000.ethernet: configuring instance for external RGMII (RX delay)
[    4.157219] bcmgenet fd580000.ethernet eth0: Link is Down
acquiring IP address via DHCP on eth0...
udhcpc: started, v1.32.1
udhcpc: sending discover
udhcpc: sending discover
[    8.256446] bcmgenet fd580000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
[    8.264700] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
udhcpc: sending discover
udhcpc: no lease, failing
kbd_mode: KDSKBMODE: Inappropriate ioctl for device
starting device mapper and LVM...
[   13.684725] random: lvm: uninitialized urandom read (4 bytes read)
mounting 192.168.1.2:/export/rpifour2 on /...
mount: mounting 192.168.1.2:/export/rpifour2 on /mnt-root/ failed: Network is unreachable

An error occurred in stage 1 of the boot process, which must mount the
root filesystem on `/mnt-root' and then start stage 2.  Press one
of the following keys:

  r) to reboot immediately
  *) to ignore the error and continue

Alright, I threw in a sleep in the initrd-network process so the link has time to actually come up before we start asking udhcpc for a lease.
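
An alternative to patching in a sleep, which I have not tested here, would be to let udhcpc keep retrying until the link is up. This sketch assumes the boot.initrd.network.udhcpc.extraArgs option is available in the nixpkgs revision in use. The -t flag is busybox udhcpc's retry count (default 3), and the value 10 is an arbitrary choice.

```nix
{
  boot.initrd.network.enable = true;
  # Assumption: with the default pause of about 3 seconds between
  # discovers, 10 attempts outlasts the ~4 seconds the GENET link
  # needs to come up in the log above.
  boot.initrd.network.udhcpc.extraArgs = [ "-t" "10" ];
}
```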

Even closer…

running udev...
Starting version 247
bringing up network interface eth0...
[    4.139873] bcmgenet fd580000.ethernet: configuring instance for external RGMII (RX delay)
[    4.148718] bcmgenet fd580000.ethernet eth0: Link is Down
[    8.227971] bcmgenet fd580000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
[    8.236213] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
acquiring IP address via DHCP on eth0...
udhcpc: started, v1.32.1
udhcpc: sending discover
udhcpc: sending discover
udhcpc: sending select for 192.168.1.130
udhcpc: lease of 192.168.1.130 obtained, lease time 86400
kbd_mode: KDSKBMODE: Inappropriate ioctl for device
starting device mapper and LVM...
[   12.516577] random: lvm: uninitialized urandom read (4 bytes read)
mounting 192.168.1.2:/export/rpifour2 on /...
[  203.811802] random: crng init done
[  318.435774] svc: failed to register lockdv1 RPC service (errno 110).
[  318.442310] lockd_up: makesock failed, error=-110
mount: mounting 192.168.1.2:/export/rpifour2 on /mnt-root/ failed: Connection timed out

An error occurred in stage 1 of the boot process, which must mount the
root filesystem on `/mnt-root' and then start stage 2.  Press one
of the following keys:

  r) to reboot immediately
  *) to ignore the error and continue
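
The lockd errors (errno 110 is a timeout) look like the NFS client trying to register its lock service with a local rpcbind that does not exist inside the initrd. If that reading is right, mounting the root with nolock may get past this. A sketch of the root filesystem entry, assuming the same server and export path as in the log, and vers=3 because of the busybox limitation mentioned earlier:

```nix
{
  fileSystems."/" = {
    device = "192.168.1.2:/export/rpifour2";
    fsType = "nfs";
    # Assumption: "nolock" keeps locks local so lockd is never started
    # in stage 1; "ro" matches the read-only export on the server.
    options = [ "ro" "nolock" "vers=3" ];
  };
}
```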