`nixos-rebuild switch` is failing to install grub, `/boot` partition disappeared

Hello :wave:,

My issue manifested with nixos-rebuild switch erroring out with the following log:

$ sudo nixos-rebuild switch --flake .# --install-bootloader
building the system configuration...
updating GRUB 2 menu...
installing the GRUB 2 boot loader on /dev/sda...
Installing for i386-pc platform.
/nix/store/k16fsfwnccjmknzrrqhcmwm3l8g2p61d-grub-2.06/sbin/grub-install: warning: this GPT partition label contains no BIOS Boot Partition; embedding won't be possible.
/nix/store/k16fsfwnccjmknzrrqhcmwm3l8g2p61d-grub-2.06/sbin/grub-install: error: embedding is not possible, but this is required for RAID and LVM install.
/nix/store/lr12rz9yj3rkfbgkdlcj7d87wrsi972a-install-grub.pl: installation of GRUB on /dev/sda failed: Inappropriate ioctl for device
warning: error(s) occurred while switching to the new configuration

Looking around on the system, I noticed that my /boot partition disappeared(!).

$ lsblk
NAME          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
fd0             2:0    1    4K  0 disk
sda             8:0    0  1.8T  0 disk
└─sda1          8:1    0  1.8T  0 part
  └─lvm-media 254:0    0  3.6T  0 lvm  /nix/store
                                       /
sdb             8:16   0  1.8T  0 disk
├─sdb1          8:17   0  1.8T  0 part
│ └─lvm-media 254:0    0  3.6T  0 lvm  /nix/store
│                                      /
└─sdb2          8:18   0   16G  0 part [SWAP]
$ sudo ls /boot
background.png  converted-font.pf2  grub
$ df -h /boot
Filesystem                Size  Used Avail Use% Mounted on
/dev/disk/by-label/nixos  3.6T  3.3T  113G  97% /
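
The same check can be scripted by asking df which mount point actually backs a path: if the answer for /boot is / rather than /boot, no separate boot partition is mounted. This is only a minimal sketch, assuming a POSIX df and device names without spaces; the function name mounted_on is made up for illustration.

```shell
# mounted_on PATH: print the mount point of the filesystem that holds PATH.
# Hypothetical helper; assumes device and mount point names contain no spaces.
mounted_on() {
    # -P forces the one-line-per-filesystem POSIX layout; column 6 is "Mounted on".
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

# Example call (output depends on the machine):
#   mounted_on /boot
# A result of "/" means /boot is just a directory on the root filesystem;
# a result of "/boot" means a dedicated filesystem is mounted there.
```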

This is surprising to me, as I should instead have the following setup (according to both my memory and the install script for that server):

  • Two disks of 2TB
    • One has three partitions
      • Boot partition
      • Media/root partition
      • Swap partition
    • The second disk has a single partition, to be used through LVM as a media/root partition with the other disk’s.

I have a few questions:

  1. How did I lose my /boot partition?
  2. Is that why nixos-rebuild switch fails?
  3. How do I recreate it? Since this is a server, I would like to avoid shutting it down in case I can never reboot it…
  4. Can I avoid this happening again in the future?

Mhmmm I’m not sure exactly how it happened, but here’s how I fixed it:

  1. Noticed that /dev/sda and /dev/sdb appeared to have swapped between my initial install and the current state of the system, somehow…
  2. Made /dev/sdb1 bootable with parted /dev/sdb set 1 boot on, as it didn’t have that flag set.
  3. Changed boot.loader.grub.device to /dev/sdb, since that’s the disk that should contain the boot partition.

At least nixos-rebuild switch finally worked.

If anybody has comments on what happened here, or how I can improve my config to avoid this in the future, I’d gladly take your advice.

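I’ve had some boot confusion in the past that was caused by a similar issue: sda and sdb switching. The kernel assigns sdX names in detection order, so they are not guaranteed to stay the same between boots. I’ve since sworn to rely only on /dev/disk/by-id/ paths in boot.loader.grub.device.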
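
To find the stable name for a disk, one option is to scan /dev/disk/by-id/ for symlinks that resolve to the device node in question. The sketch below assumes a readlink that supports -f (GNU coreutils or BusyBox); by_id_links is a hypothetical helper name, not something NixOS ships.

```shell
# by_id_links DEVICE [DIR]: print every symlink in DIR (default /dev/disk/by-id)
# that resolves to DEVICE. Hypothetical helper for illustration only.
by_id_links() {
    target=$(readlink -f "$1") || return 1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$target" ]; then
            printf '%s\n' "$link"
        fi
    done
}

# Example call (output depends on the machine):
#   by_id_links /dev/sdb
```

Any of the printed whole-disk links can then be used as the value of boot.loader.grub.device.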


I had this same issue. I believe it happened after I added new mount points.

I followed these steps to find the device’s by-id name, as recommended by badcold:

❯ df -h /boot/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdc1       882G  506G  332G  61% /
nixos🔒 on main [$!+] took 13s
❯ blkid /dev/sdc1
/dev/sdc1: UUID="2a01f7ff-22d8-41ea-9a7f-cea3f739b1ef" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="0902a176-01"
nixos🔒 on main [$!+] took 11s
❯ ls -l /dev/disk/by-id/ | grep "sdc1" --color
lrwxrwxrwx 1 root root  10 May 13 19:25 ata-CT1000BX500SSD1_2516E9B6981F-part1 -> ../../sdc1
lrwxrwxrwx 1 root root  10 May 13 19:25 wwn-0x500a0751e9b6981f-part1 -> ../../sdc1
nixos🔒 on main [$!+] took 10s

In my config I changed boot.loader.grub.device:

boot.loader.grub.device = "/dev/disk/by-id/wwn-0x500a0751e9b6981f";

Note that I couldn’t use by-uuid (those links identify filesystems, not whole disks).
I also had to strip “-part1” from the link name “wwn-0x500a0751e9b6981f-part1”.

Hope this helps someone.

There’s a simpler way to get that info :slight_smile:, assuming you’re not using LVM.

$ df /boot --output=source | tail -n1 | xargs lsblk -o NAME,ID
NAME ID
sda2 0x5002538d426d4415-part2

That command is adapted from “How do I get from /dev/sdi to /dev/disk/by-id?” (#5, by ElvishJerricco). Its output lacks the wwn- prefix, so remember to add it back.

And you have to drop the -partN suffix because GRUB is installed to a whole block device, while /boot lives on a partition. You generally do not install GRUB to a partition.
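
Dropping the suffix is easy to script. A small sketch, assuming by-id partition links always end in -part followed by digits; disk_id_from_part is a hypothetical name chosen for this example.

```shell
# disk_id_from_part NAME: strip a trailing "-partN" from a by-id link name,
# leaving the name of the whole disk. Hypothetical helper for illustration.
disk_id_from_part() {
    printf '%s\n' "$1" | sed -E 's/-part[0-9]+$//'
}

# Example:
#   disk_id_from_part wwn-0x500a0751e9b6981f-part1
# prints wwn-0x500a0751e9b6981f
```

A name that has no -partN suffix is passed through unchanged, so the function is safe to apply to whole-disk names as well.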

Thanks, waffle8946.

I’m also getting this error; I’m not sure if it’s part of the same thing.

This is what comes up at the end of my rebuild:

Command 'systemd-run -E LOCALE_ARCHIVE -E NIXOS_INSTALL_BOOTLOADER -E NIXOS_NO_CHECK --collect --no-ask-password --pipe --quiet --service-type=exec --unit=nixos-rebuild-switch-to-configuration /nix/store/m7yk6bh4lcrlpsmk4pza8fgx9ay40p66-nixos-system-nixos-26.05.20260510.da5ad66/bin/switch-to-configuration switch' returned non-zero exit status 2.