What to do with a full /boot partition?

I have a Raspberry Pi which I use as a git server on my home network. It’s got NixOS on it, but now nixos-rebuild commands fail due to a full /boot partition. I found this issue, but there are no instructions for not-very-Linux-savvy people like me on how to solve it (not why this is happening, but what to do in my shoes).

What do I have to do to make room on my /boot partition? Here’s the output of df -k:

Filesystem                  1K-blocks    Used Available Use% Mounted on
devtmpfs                        46532       0     46532   0% /dev
tmpfs                          465288       0    465288   0% /dev/shm
tmpfs                          232644    6248    226396   3% /run
tmpfs                          465284     360    464924   1% /run/wrappers
/dev/disk/by-label/NIXOS_SD  59630144 5539304  51630828  10% /
tmpfs                          465284       0    465284   0% /sys/fs/cgroup
/dev/mmcblk0p1                 122622  122622         0 100% /boot
tmpfs                           93056       0     93056   0% /run/user/1001

You can limit the number of configurations kept for GRUB and U-Boot. Personally though, I just combine / and /boot into one partition and limit the maximum number of configurations to one, since I’m half certain I won’t break it.

  boot = {
    kernelPackages = pkgs.linuxPackagesFor pkgs.linux_rpi_4_19;

    kernelParams = [
      "cma=32M"
      # Serial Console
      "console=tty0"
      "console=ttyS0,115200n8"
      "console=ttyAMA0,115200n8"

      # deadline I/O scheduler
      "elevator=deadline"
    ];

    loader = {
      grub.enable = false;

      raspberryPi = {
        enable = true;
        version = 3;
        firmwareConfig = ''
          # boot_delay=1
          # force_turbo=1
          gpu_mem=64
        '';

        uboot = {
          enable = true;
          configurationLimit = 1;
        };
      };
    };
  };

Though on my main machine, I periodically (cough weekly) run nixos-rebuild switch; nix-clean; nixos-rebuild switch, where nix-clean is the following shell function:

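nix-clean () {
  # Drop old generations of the user profile, then collect garbage
  nix-env --delete-generations old
  nix-store --gc
  nix-channel --update
  nix-env -u --always
  # Delete the targets of indirect GC roots (e.g. leftover result links)
  for link in /nix/var/nix/gcroots/auto/*
  do
    rm -- "$(readlink "$link")"
  done
  # Delete old profile generations and collect garbage again
  nix-collect-garbage -d
}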

You should be able to do something like:

sudo nix-collect-garbage --delete-older-than 5d

I do

nix-env -p /nix/var/nix/profiles/system --delete-generations +2

With --delete-older-than 5d you will sooner or later hit the “too many kernels in the last 5 days” case, which overflows /boot. Limiting the exact number of generations, using an approximate value for how many kernel and initrd images fit in /boot, gives some kind of guarantee against /boot overflow.

If “too many kernels in the last 5 days” sounds ridiculous, “too many changes in initrd” (secrets, initrd-networking settings, etc.) does the same thing.
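
To pick the number for --delete-generations, a rough estimate may be enough. The sketch below is only arithmetic: generations_that_fit is a made-up helper name, and the 30000 KB per kernel+initrd set in the usage note is an assumed figure, not something measured on the Pi in this thread.

```shell
#!/bin/sh
# Hypothetical helper: estimate how many kernel+initrd sets fit in /boot.
# All sizes are in 1K blocks, as printed by `df -k` and `du -k`.
generations_that_fit() {
  boot_kb=$1           # total size of the /boot partition
  set_kb=$2            # size of one kernel + initrd (+ dtbs) set
  reserve_kb=${3:-0}   # space kept free for the generation being installed
  if [ "$set_kb" -le 0 ]; then
    echo "set size must be positive" >&2
    return 1
  fi
  # Integer division: partial sets do not count.
  echo $(( (boot_kb - reserve_kb) / set_kb ))
}
```

With the 122622 KB /boot from the df output above and the assumed 30000 KB per set, generations_that_fit 122622 30000 prints 4, and generations_that_fit 122622 30000 30000 prints 3 once room for one incoming generation is reserved. Since generations can share a kernel or initrd, this is a pessimistic estimate. nix-env -p /nix/var/nix/profiles/system --list-generations shows how many generations currently exist.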


Thank you for the suggestions, but unfortunately none of those helped. After nix-env -p /nix/var/nix/profiles/system --delete-generations +2 and sudo nix-collect-garbage --delete-older-than 5d, the /boot partition is still full. Can I somehow inspect what is filling it?


You’ll need to run nixos-rebuild again; that will free up the space on /boot that was associated with the deleted generations.


Well that is precisely the problem. I cannot run nixos-rebuild since it fails with no space left on device:

[lassi@snadi:~]$ sudo nixos-rebuild boot
[sudo] password for lassi:
building Nix...
building the system configuration...
cat: write error: No space left on device
warning: error(s) occurred while switching to the new configuration

OK. Not sure how to help then, but can give some background (as I’ve had this in the past too).

In my case, compressed kernel and initrd images accumulate under /boot/EFI/nixos/ (I’m on x86_64 and using systemd-boot) due to updated kernels and changes affecting the initrd in successive generations, noting that each image may be shared across a number of generations. I’ve never tried NixOS on an RPi, but I imagine the partition is filling up for similar reasons.

Clearing a sufficient number of old generations (so that the associated kernel and initrd images aren’t needed anymore) and then running nixos-rebuild should work, as it’s what’s suggested for this issue under Chapter 35.1: NixOS Boot Entries in the NixOS Manual. In my case, it has always worked: nixos-rebuild has cleared the unneeded images off the /boot partition first, and then applied the new generation… which succeeds as there is now free space under /boot to write the new images.

Hopefully someone with more knowledge than me (I’m also fairly new to NixOS) can help you out.


…which I’ve just realised is the opposite behaviour to what is described in that open bug report you linked in the original post! But nonetheless, I really have solved it that way on my machine in the past.

Really hoping someone with more knowledge than me stops by this thread to help you out! :slightly_smiling_face:

Assuming the links to the old generations in /nix/var/nix/profiles have been deleted and nix-collect-garbage has been run, you should be able to clear out the extra entries in /boot with:
/run/current-system/bin/switch-to-configuration switch

I’m harvesting some questions coming from newcomers (and not-so-new users as well), and this one comes up more often than I’d have thought. Maybe we should state clearly in the manual that not only does the store increase in size during normal operation, but /boot does as well if it is a separate partition.

The issue is that nixos-rebuild removes old kernels if the generations referencing them have been garbage collected, but it does so only after the kernel of the generation you’re trying to switch to or boot into has been installed.

At the time I didn’t know this and installed NixOS with a small boot partition, and I haven’t moved since then, so this is the workflow I’ve developed. It assumes you’re a normal user with sudo privileges; if you’re not, do everything as root.

Make sure you understand every step, or you could end up with a non-bootable system.

  1. Run sudo nixos-rebuild build so that you’re sure your current configuration can actually be built
  2. Run a garbage collection to remove old system generations with sudo nix-collect-garbage -d
  3. Manually make some space in /boot: find your old kernels and rm them
  4. Run sudo nixos-rebuild switch or sudo nixos-rebuild boot. This time your bootloader will be installed correctly along with the new kernel and initrd
  5. Make sure step 4 was executed correctly by looking at its output, then reboot
  6. [optional] Remove the result symlink created by step 1
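
The steps above can be sketched as a small script. This is a hedged outline, not a tested recovery tool: run, free_boot_and_rebuild, DRY_RUN and STALE_FILES are names invented here, and STALE_FILES must be filled in by hand with the paths under /boot you have decided are safe to delete (paths containing spaces are not handled). By default it only prints what it would do.

```shell
#!/bin/sh
# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

free_boot_and_rebuild() {
  # Step 1: confirm the configuration builds before touching anything.
  run sudo nixos-rebuild build || return 1
  # Step 2: drop old generations and collect garbage.
  run sudo nix-collect-garbage -d || return 1
  # Step 3: remove the hand-picked files (hypothetical STALE_FILES list).
  for f in $STALE_FILES; do
    run sudo rm -r -- "$f" || return 1
  done
  # Step 4: install the bootloader entries for the new generation.
  run sudo nixos-rebuild boot || return 1
  # Step 6: remove the result symlink left behind by step 1.
  run rm -f result
}
```

Step 5 (reading the output and rebooting) is left to you on purpose. A dry run such as STALE_FILES="/boot/some-old-kernel" free_boot_and_rebuild lets you check the exact commands before setting DRY_RUN=0.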

Thanks for the instructions. How do I know which files are safe to delete?


Well, I tried removing the oldest kernel. After that, sudo nixos-rebuild boot gave me this:

rm: cannot remove '/boot/nixos/zbpbdvcazhh44sfzv9rgy9wllnab66v5-linux-4.18.7-dtbs/broadcom': Read-only file system
warning: error(s) occurred while switching to the new configuration

Edit: If it helps, I have this extra bit in my configuration.nix (as suggested here: NixOS on ARM/Raspberry Pi - NixOS Wiki):

  hardware.enableRedistributableFirmware = true;
  hardware.firmware = [
    (pkgs.stdenv.mkDerivation {
      name = "broadcom-rpi3-extra";
      src = pkgs.fetchurl {
        url = "https://raw.githubusercontent.com/RPi-Distro/firmware-nonfree/54bab3d/brcm80211/brcm/brcmfmac43430-sdio.txt";
        sha256 = "19bmdd7w0xzybfassn7x4rb30l70vynnw3c80nlapna2k57xwbw7";
      };
      phases = [ "installPhase" ];
      installPhase = ''
        mkdir -p $out/lib/firmware/brcm
        cp $src $out/lib/firmware/brcm/brcmfmac43430-sdio.txt
      '';
    })
  ];

The boot operation actually rebuilds everything it needs, so if a file is missing it will be added, and if it’s part of an old generation it will be removed. I have never encountered the read-only filesystem issue; maybe /boot is handled differently on ARM boards? (pinging @dezgeg as the builder/maintainer of some ARM installation images)

@ilikeavocadoes Any update on this? Did you succeed?

Thanks for checking up! I didn’t; I still get the same Read-only file system error when nixos-rebuild is removing old files. I don’t really know what to do about it, since sudo rm -rf <the file> also results in Read-only file system.

Edit.
I just noticed that ls cannot access the directory: ls: cannot access '/boot/nixos/zbpbdvcazhh44sfzv9rgy9wllnab66v5-linux-4.18.7-dtbs/broadcom': Input/output error

I had this disk-full problem when upgrading a machine to 19.03. I guess the 77th generation was just one too many!

This saved the day, cleared out a bunch of generations, and nixos-rebuild switch --upgrade was able to complete. Happy days!


Post the output of ls -alhR /boot here so we can see exactly what’s taking up space.
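
If the full recursive listing is too long to read, a size-sorted view may be easier to scan. This is only a sketch: largest_entries is a made-up helper name, and it assumes the usual du, sort and head tools are available on the machine.

```shell
#!/bin/sh
# Hypothetical helper: list the biggest files and directories under a path.
# Usage: largest_entries DIR [COUNT]
largest_entries() {
  dir=$1
  count=${2:-20}
  # du -a lists files as well as directories, -k reports sizes in 1K blocks;
  # sort -rn puts the largest entries first.
  du -ak "$dir" | sort -rn | head -n "$count"
}
```

Something like largest_entries /boot 15 (run as root if /boot is not readable otherwise) prints the total for /boot first, followed by the largest kernel, initrd and dtbs entries.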

nixos-rebuild calls switch-to-configuration which calls the system.build.installBootLoader script on boot | switch.

That logic varies between bootloaders, and I haven’t verified how each of them works. All of them should follow logic similar to this:

  1. Clean /boot from all the generations
  2. Install the /boot entries from the newest to the oldest
  3. On disk full, complain and let the user know which generations are missing, but don’t fail the script

Make sure that (3) fails if no generation at all could be installed, and make sure that (2) doesn’t fail.
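
As a toy model of that proposal (not how any existing NixOS bootloader script is written), the logic could look like the sketch below. plan_boot_entries is a hypothetical name, generation 1 means "newest" here, and the model ignores the fact that real generations can share kernel and initrd files. It touches no files; it only does the bookkeeping.

```shell
#!/bin/sh
# Usage: plan_boot_entries CAPACITY_KB SIZE_NEWEST SIZE_NEXT ...
# Sizes are per-generation kernel+initrd footprints in KB, newest first.
plan_boot_entries() {
  free_kb=$1; shift
  gen=0; installed=0; skipped=""
  # (1) is implicit: we start from a cleaned /boot with CAPACITY_KB free.
  for size_kb in "$@"; do
    gen=$((gen + 1))
    if [ "$size_kb" -le "$free_kb" ]; then
      # (2) install entries from the newest to the oldest.
      free_kb=$((free_kb - size_kb))
      installed=$((installed + 1))
      echo "installed generation $gen"
    else
      # (3) remember what did not fit, but keep going.
      skipped="$skipped $gen"
    fi
  done
  [ -n "$skipped" ] && echo "warning: no room for generations:$skipped" >&2
  # Fail only when not even one generation could be installed.
  [ "$installed" -gt 0 ]
}
```

For example, plan_boot_entries 100 40 40 40 installs generations 1 and 2, warns on stderr that generation 3 did not fit, and still exits successfully, whereas plan_boot_entries 10 40 installs nothing and returns a failure status.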


I’m pretty sure some of them try to install the new generation before cleaning things up. That’s a safer heuristic, but it also fails hard if /boot is full. Combined with the fact that nix-store --gc doesn’t collect old boot entries, it’s a deadly mix.
