How can I clean up /boot, without a reboot

I have already run garbage collection, but I imagine cleaning up /boot needs a reboot?

df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3,2G     0  3,2G   0% /dev
tmpfs            32G  149M   32G   1% /dev/shm
tmpfs            16G  7,8M   16G   1% /run
tmpfs            32G  720K   32G   1% /run/wrappers
/dev/dm-1       220G   70G  139G  34% /
/dev/sda1       511M  511M     0 100% /boot
tmpfs           6,3G   41M  6,3G   1% /run/user/1000
//thunar/share  952G  172G  780G  19% /home/b0ef/mnt/thunar-share
[b0ef@ximian:~]$

I can’t build anything because it needs space on /boot, but I don’t want to reboot.

Any pointers as to what I can do?

This is a long-standing nixos-rebuild system activation script bug: "When /boot is full, system rebuilds fail" (Issue #23926, NixOS/nixpkgs on GitHub).

What you need to do in this case is to manually remove one or two old kernel+initrds that were part of the GC’d generations.

Then you can run nixos-rebuild boot again. It’ll first copy over the current generation’s kernel+initrd and then clear out the rest of the GC’d generations’ files from /boot/ automatically.
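
To find candidates for that manual removal, a small helper along these lines may be useful. It is only a sketch under stated assumptions: it assumes a systemd-boot style layout in which kernels and initrds are copied to one directory (commonly /boot/EFI/nixos) under the name `<store directory>-<file name>.efi`, and that generations are the `system-*-link` symlinks in /nix/var/nix/profiles. Other boot loaders lay their files out differently. The function name `list_unreferenced` is made up for this example, and it only prints paths; it never deletes anything.

```sh
# list_unreferenced BOOT_DIR PROFILES_DIR
# Print every *.efi file in BOOT_DIR that no remaining generation in
# PROFILES_DIR refers to. Assumes files are named
# "<store dir>-<file name>.efi" (an assumption about the boot loader layout).
list_unreferenced() {
  local boot_dir="$1"
  local profiles_dir="$2"
  local wanted gen f target

  # Build the list of file names the remaining generations still need.
  wanted=$(
    for gen in "$profiles_dir"/system-*-link; do
      [ -e "$gen" ] || continue
      for f in kernel initrd; do
        target=$(readlink -f "$gen/$f") || continue
        printf '%s-%s.efi\n' \
          "$(basename "$(dirname "$target")")" "$(basename "$target")"
      done
    done
  )

  # Anything in BOOT_DIR that is not on that list is a removal candidate.
  for f in "$boot_dir"/*.efi; do
    [ -e "$f" ] || continue
    printf '%s\n' "$wanted" | grep -qxF -- "$(basename "$f")" \
      || printf '%s\n' "$f"
  done
}
```

A typical call would be `list_unreferenced /boot/EFI/nixos /nix/var/nix/profiles`. After checking the output by hand, remove one or two of the listed files and run nixos-rebuild boot again.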

Edit: The fault is in the system activation script that gets executed by nixos-rebuild, not nixos-rebuild itself.


I struggled with this a lot, and was frightened by the suggestions to delete kernels that I hoped I wasn’t using. I’d like to test my new understanding and hopefully comfort other people.

  1. nixos-rebuild switch (and maybe some other commands) makes sure that /boot/EFI/nixos and /nix/store contain everything needed for all of the system profiles in /nix/var/nix/profiles/. After that, it removes any unused files from /boot/EFI/nixos.
  2. This explains why it’s not enough to remove old generations/profiles.
  3. It also explains why manually deleting kernels doesn’t risk creating broken entries in grub: nixos-rebuild will put everything back (as long as you don’t crash or restart while in this state).
  4. If you use flakes, it’s even better than this. You can remove old generations, then re-run nixos-rebuild to rebuild the currently installed version of your config (assuming that you still have it; git commit is your friend), which requires no new space on /boot/. Maybe this is possible without flakes, but I never figured out how. This frees up space on /boot, and now you can upgrade as normal.

Did I get this right?


That is correct but I don’t know what you mean by the flake part. Flakes don’t play into this at all.

What I meant is that before I managed my system with a flake, when I ran out of space on /boot I was pretty stuck if nixos-rebuild wanted to install a new kernel. I didn’t know how to tell it to rebuild the current setup and clean up any unused kernels instead of installing a new kernel first. Thus, I had to remove older generations, and then manually remove the matching kernels to free up space.

With a flake system, though, I can remove old generations of my profile, re-run nixos-rebuild switch --flake ... and have it remove any unused kernels. This feels much less sketchy to me

The same is true without flakes.

The core of what flakes do is a standardised method to manage external dependencies. How you do that has no influence on whether the activation script removes unused kernels after you remove old generations; it’s entirely independent.

Oh, great. In that case, why is the standard advice to manually remove files from /boot instead of just removing old generations and then regenerating the current profile, cleaning up /boot at the same time?

Because (at least at the time I wrote said advice), the activation script would first copy new kernels+initrds to /boot/ and only then delete unused ones.

You need to do both: Remove the profile, delete enough unused files from /boot/ so that any potential new files can be copied and then run the activation script.
If you only deleted old kernels, the activation script would copy them right back; using the space again.
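
The effect of that ordering can be shown with a toy model. This is a sketch with made-up numbers, not the real activation script: all sizes are in MiB, both function names are hypothetical, and the delete-first variant exists only as a contrast to the copy-first behaviour described above.

```sh
# copy_first_fits CAPACITY USED NEW_SIZE
# Models "copy new files, then delete unused ones": the new kernel+initrd
# must fit next to everything that is already on /boot.
copy_first_fits() {
  local capacity="$1" used="$2" new_size="$3"
  [ $((used + new_size)) -le "$capacity" ]
}

# delete_first_fits CAPACITY USED STALE NEW_SIZE
# Hypothetical contrast: if unused files (STALE) were removed before
# copying, only the remainder would compete with the new files.
delete_first_fits() {
  local capacity="$1" used="$2" stale="$3" new_size="$4"
  [ $((used - stale + new_size)) -le "$capacity" ]
}
```

With a 511 MiB partition that is completely full and a new 40 MiB kernel+initrd, `copy_first_fits 511 511 40` fails even if 200 MiB of that data is unused. After manually removing 60 MiB of unused files, `copy_first_fits 511 451 40` succeeds, and the script can then clear out the rest itself.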

But if we are just rebuilding/reactivating the current setup then there is no new kernel, right?

Indeed.

Though you will typically only notice the need to clean up when it’s already too late and you’ve run out of space.
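
One way to notice earlier is to check the free space on /boot before rebuilding. The following is a minimal sketch, assuming a POSIX `df` and `awk`; the function names and any threshold you pass in are illustrative choices, not something nixos-rebuild provides.

```sh
# boot_avail_kib DIR
# Print the space available on the filesystem holding DIR, in KiB.
# "df -P" forces the portable one-line-per-filesystem output format.
boot_avail_kib() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# boot_has_room DIR NEEDED_KIB
# Succeed if at least NEEDED_KIB KiB are available on DIR's filesystem.
boot_has_room() {
  [ "$(boot_avail_kib "$1")" -ge "$2" ]
}
```

For example, `boot_has_room /boot 102400 || echo "less than 100 MiB free on /boot"` could be run before a rebuild; the 100 MiB figure is arbitrary and should be adjusted to the size of your kernel and initrd.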

But that’s my point. I think that you can do this when you have already run out of space, by removing old generations (but not kernels) and then rebuilding the current profile, so that no new kernel is required

The problem in the scenario I described is that you typically only notice that you’ve run out of space when you’ve already built and attempted to activate a new generation with a new kernel/initrd. The current profile would not be in /boot/ yet. (Adding the generation and switching the runtime to it works even if /boot/ is full.)

Activating a previous profile also wouldn’t help in this case as the activation script copies kernels and initrds of all available generations.

Hmm. Next time I run out of space I’m going to take careful note of how I resolve this. I still think that I could remove the new generation and some older generations, rebuild the previous successfully installed generation, and resolve the problem. But I’ll let you know.