I have a Raspberry Pi that I use as a git server on my home network. It runs NixOS, but nixos-rebuild commands now fail because the /boot partition is full. I found this issue, but it has no instructions for not-very-Linux-savvy people like me on how to solve it (that is, not why this is happening, but what to do in my shoes).
What do I have to do to make room on my /boot partition? Here’s the output of df -k:
You can limit the number of configurations for grub and uboot. Personally, though, I just combine / and /boot into one partition and limit the maximum number of configurations to one, since I’m half certain I won’t break it.
With --delete-older-than 5d you will sooner or later hit the “too many kernels in the last 5 days” case, which overflows /boot. Limiting the exact number of generations, based on an estimate of how many kernels and initrds fit in /boot, gives some kind of guarantee against /boot overflowing.
If “too many kernels in the last 5 days” sounds ridiculous, “too many changes to the initrd” (secrets, initrd-networking settings, etc.) does the same thing.
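As a rough illustration of that estimate, here is a minimal sketch in shell. It assumes the worst case where every generation brings its own kernel and initrd (no sharing between generations), and it keeps one slot free for the generation being installed. The function name `max_generations` and the sizes in the example call are hypothetical; measure your own files before relying on the result.

```shell
#!/bin/sh
# Rough upper bound on how many generations fit in /boot.
# Worst-case assumption: no kernel or initrd is shared between generations.
# All sizes are in KiB.
max_generations() {
    boot_kib=$1      # total size of the /boot partition
    kernel_kib=$2    # size of one kernel image
    initrd_kib=$3    # size of one initrd image
    per_gen=$((kernel_kib + initrd_kib))
    if [ "$per_gen" -le 0 ]; then
        echo 0
        return 1
    fi
    # Keep one slot free for the generation that is being installed.
    limit=$((boot_kib / per_gen - 1))
    if [ "$limit" -lt 0 ]; then
        limit=0
    fi
    echo "$limit"
}

# Hypothetical numbers: 120 MiB /boot, 8 MiB kernel, 12 MiB initrd.
max_generations 122880 8192 12288   # prints 5
```

The printed number is only a starting point for the configuration limit; other files in /boot (device trees, bootloader files) reduce it further.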
Thank you for the suggestions, but unfortunately none of those helped. After nix-env -p /nix/var/nix/profiles/system --delete-generations +2 and sudo nix-collect-garbage --delete-older-than 5d, the /boot partition is still full. Can I somehow inspect what is filling it?
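One way to see what is taking up the space is to list the largest entries under /boot with standard tools. The following is a minimal sketch; the helper name `largest_entries` is made up for illustration, and it only reads from the filesystem.

```shell
#!/bin/sh
# List the largest files and directories under DIR, biggest first.
# Usage: largest_entries DIR [COUNT]
largest_entries() {
    dir=${1:-/boot}
    count=${2:-10}
    # -x: stay on one filesystem, -a: include files, -k: sizes in KiB.
    # Directories show their cumulative size, so DIR itself comes first.
    du -xak "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Example (run as root if /boot is not world-readable):
#   largest_entries /boot 15
```

Errors from unreadable entries are discarded, so a directory that cannot be read simply will not show up in the listing.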
Well that is precisely the problem. I cannot run nixos-rebuild since it fails with no space left on device:
[lassi@snadi:~]$ sudo nixos-rebuild boot
[sudo] password for lassi:
building the system configuration...
cat: write error: No space left on device
warning: error(s) occurred while switching to the new configuration
OK. Not sure how to help then, but can give some background (as I’ve had this in the past too).
In my case, compressed kernel and initrd images accumulate under /boot/EFI/nixos/ (I’m on x86_64 and using systemd-boot) due to updated kernels and changes affecting the initrd in successive generations, noting that each image may be shared across a number of generations. I’ve never tried NixOS on an RPi, but I imagine the partition is filling up for similar reasons.
Clearing a sufficient number of old generations (so that the associated kernel and initrd images aren’t needed anymore) and then running nixos-rebuild should work, as that is what’s suggested for this issue under Chapter 35.1: NixOS Boot Entries in the NixOS Manual. In my case, it has always worked: nixos-rebuild has cleared the unneeded images off the /boot partition first, and then applied the new generation… which succeeds as there is now free space under /boot to write the new images.
Hopefully someone with more knowledge than me (I’m also fairly new to NixOS) can help you out.
…which I’ve just realised is the opposite behaviour to what is described in that open bug report you linked in the original post! But nonetheless, I really have solved it that way on my machine in the past.
Really hoping someone with more knowledge than me stops by this thread to help you out!
Assuming the links to the old generations in /nix/var/nix/profiles have been deleted and nix-collect-garbage has been run, you should be able to clear out the extra entries in /boot with: /run/current-system/bin/switch-to-configuration switch
I’m harvesting some questions coming from newcomers (and not-so-new users as well), and this one comes up more often than I’d have expected. Maybe we should state clearly in the manual that not only does the store grow during normal operation, but /boot does as well if it is a separate partition.
The issue is that nixos-rebuild removes old kernels if the generations referencing them have been garbage collected but the operation is done after the kernel of the generation you’re trying to switch/boot into is installed.
At the time I didn’t know this and installed NixOS with a small boot partition, and I haven’t moved since then, so this is the workflow I’ve developed. It assumes you’re a normal user with sudo privileges; if you’re not, do everything as root.
Make sure you understand every step, or you could end up with a non-bootable system.
1. Do a sudo nixos-rebuild build so that you’re sure that the build of your current configuration can be carried out.
2. Do a garbage collection to remove old system generations with sudo nix-collect-garbage -d.
3. Manually make some space in /boot: find your kernels and rm them.
4. Run sudo nixos-rebuild switch or sudo nixos-rebuild boot. This time your bootloader will be installed correctly along with the new kernel and initrd.
5. Make sure point 4 was executed correctly by looking at the output, then reboot.
6. [optional] Remove the result directory created by point 1.
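The steps above can be sketched as a small script. This is only an illustration, not a tested recovery tool: the `run` wrapper and the `RUN` variable are made up here, and by default nothing is executed, the commands are only printed. The manual step (deciding which kernels to remove) is deliberately left to you, because removing the wrong file can leave the system unbootable.

```shell
#!/bin/sh
# Dry-run wrapper: prints each command unless RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

free_boot_workflow() {
    # 1. Confirm that the current configuration still builds.
    run sudo nixos-rebuild build || return 1
    # 2. Remove old system generations and collect garbage.
    run sudo nix-collect-garbage -d || return 1
    # 3. Manual step, intentionally not automated.
    echo "manual step: free space in /boot by removing kernels you no longer need"
    if [ "${RUN:-0}" = "1" ]; then
        printf 'Press Enter once space has been freed: '
        read -r reply
    fi
    # 4. Install the bootloader entries together with the new kernel and initrd.
    run sudo nixos-rebuild boot || return 1
    # 5. Left to the user: read the output above before rebooting.
    echo "check the output above, then reboot"
    # 6. Optional: remove the build result left in the current directory by step 1.
    run rm -f result
}

# Print the plan without touching the system:
#   free_boot_workflow
# Actually execute it:
#   RUN=1 free_boot_workflow
```

Running the dry run first and reading every printed command fits the warning above about understanding each step.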
The boot operation actually rebuilds everything it needs, so if a file is missing it will be added, and if it’s part of an old generation it will be removed. I have never encountered the read-only filesystem issue; maybe /boot is handled differently on the ARM board? (pinging @dezgeg as the builder/maintainer of some ARM installation images)
Thanks for checking up! I didn’t; I still get the same Read-only file system error when nixos-rebuild is removing old files. I don’t really know what to do about it, since sudo rm -rf <the file> also results in Read-only file system.
I just noticed that ls cannot access the directory: ls: cannot access '/boot/nixos/zbpbdvcazhh44sfzv9rgy9wllnab66v5-linux-4.18.7-dtbs/broadcom': Input/output error
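The thread does not establish why /boot became read-only. One possibility, and this is only a guess, is that the kernel remounted the partition read-only after hitting filesystem errors, which the Input/output error would be consistent with. A minimal sketch for checking the current mount options follows; the helper names `mount_options` and `is_readonly` are made up for illustration, and the second argument exists only so the functions can be tried against a sample file instead of /proc/mounts.

```shell
#!/bin/sh
# Print the mount options of MOUNTPOINT from a table in /proc/mounts format.
# Usage: mount_options MOUNTPOINT [MOUNTS_FILE]
mount_options() {
    # Fields: device, mount point, fs type, options, dump, pass.
    # If a mount point is listed more than once, the last entry wins.
    awk -v mp="$1" '$2 == mp { opts = $4 } END { if (opts != "") print opts }' \
        "${2:-/proc/mounts}"
}

# Succeeds if MOUNTPOINT is currently mounted with the "ro" option.
is_readonly() {
    case ",$(mount_options "$@")," in
        *,ro,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   if is_readonly /boot; then echo "/boot is mounted read-only"; fi
```

If /boot does turn out to be mounted read-only, the kernel log (dmesg) is the usual place to look for the reason.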
I’m pretty sure some of them try to install the new generation before cleaning things up. That’s a safer heuristic, but it also fails hard if /boot is full. Combined with the fact that nix-store --gc doesn’t collect old boot entries, it’s a deadly mix.
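Given that ordering, a pre-flight check of the free space in /boot before rebuilding can catch the problem early. The following is a minimal sketch using df; the helper names `avail_kib` and `has_room` are made up, and the 40000 KiB threshold in the example is a placeholder that should be replaced with the actual size of your kernel plus initrd.

```shell
#!/bin/sh
# Print the available space, in KiB, on the filesystem that holds DIR.
avail_kib() {
    # -P forces the portable one-line-per-filesystem format;
    # column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeeds if the filesystem holding DIR has at least NEEDED_KIB available.
# Usage: has_room DIR NEEDED_KIB
has_room() {
    avail=$(avail_kib "$1")
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# Example (40000 is a placeholder value):
#   has_room /boot 40000 && sudo nixos-rebuild boot
```

If the check fails, older generations need to be removed first, as discussed earlier in the thread.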