Run LVM on top of LUKS, or LUKS on top of LVM?

I’ve been working on some default HDD layouts and have tested the three layouts shown below. I think there is enough detail in the image to explain each setup.

The thing I’m wondering about most is whether this is the best way to lay out the LUKS and LVM volumes. The main value of LVM is that you can easily grow and shrink volumes to suit, which suggests that the encrypted volume should be created on top of LVM. On the other hand, for security you want to leak as little about your configuration as possible, so you want to encrypt as much as possible, which suggests running LVM on top of LUKS. In this day and age it seems to me that information security should trump flexibility, particularly when storage is cheap (i.e. it does not seem likely that you would need to resize the encrypted volume).

I want to derive a good configuration that an installer could provide as its default recommendation. What do you guys think? Is there a better way that I am missing?

2 Likes

I do not use LUKS or any other kind of disk encryption myself, but I think having a big ESP (0.5 to 1 GiB) and a LUKS container spanning the remainder of the disk, with an LVM PV spanning the LUKS container, would give you the best of both worlds.

Everything that can be encrypted is encrypted, and at the same time you have the full flexibility of LVM.
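
Roughly, and untested from my side, that would look something like the following sketch (the disk path, VG name and sizes are just placeholders):

# device, VG name and sizes below are placeholders - adjust to taste
sgdisk -n1:0:+1G -t1:ef00 -n2:0:0 -t2:8309 /dev/sda   # ESP plus one partition for LUKS
mkfs.vfat -F32 /dev/sda1                              # the ESP
cryptsetup luksFormat /dev/sda2                       # LUKS container over the rest of the disk
cryptsetup open /dev/sda2 cryptlvm
pvcreate /dev/mapper/cryptlvm                         # LVM PV spanning the opened LUKS device
vgcreate vg0 /dev/mapper/cryptlvm
lvcreate -L 8G -n swap vg0
lvcreate -l 100%FREE -n root vg0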

I’d suggest not recommending LVM or any other device-mapper setups in a GUI installer until we have a GUI manager for those as well.

> I’d suggest not recommending LVM or any other device-mapper setups in a GUI installer until we have a GUI manager for those as well.

Why is that? (Also, I thought GParted already gave you this in GUI form?)

GParted can create a PV, but only at the “root” of the disk; it cannot create one on top of LUKS.

And in general, suggesting a “more flexible” layout by default, without giving users the ability to actually use that flexibility, is no better than any other layout. The installer is for people who do not want to use CLI tools, yet you would be suggesting a layout that (in the current state of nixpkgs) can only be managed with exactly those tools.

For classic non-device-mapped layouts, people can at least use GParted to move partitions around. It can take a while (days at worst), but it is something they can handle, and the way GParted represents partitions as “blocks” on a bigger “block” makes it an easy-to-grasp construct.

1 Like

I see your point, and I’ll see what I can do. In the meantime, though, I do not think we should recommend an unencrypted setup just because of a lack of GUI tools.

The Debian/Ubuntu/Fedora graphical installers offer to use LVM, even though there is no better GUI partition editor available there either.

While I don’t think LVM+encryption should be the default, I don’t think it’s unreasonable for a GUI installer to have an option for LVM/LUKS. Most people probably never need to touch the drive partitioning afterwards, so there’s hardly any harm in using one of those options, even if they can’t then manage cryptsetup or the LVM tools themselves - as long as it’s not the default default and there’s a choice involved, so user frustration is with their own choices and not the tools doing things they didn’t know about :slight_smile:

That said, non-LUKS with an encrypted root, like what Ubuntu does by default, is probably the best trade-off between complexity and security for a pure GUI user.

1 Like

I have two ways to answer this question:

  1. No. ¹
  2. LVM-on-LUKS. None of your depicted options are LUKS-on-LVM anyway. The only real reason to do separate LUKS containers would be if you want to have separate keys for the different volumes, and probably only decrypt some of them some of the time - perhaps separate volumes for personal and work data, or similar. You only have root and swap so this isn’t the case.
    Now, swap can be encrypted with a separate random key generated at each boot. This is probably a good idea and worth considering, whether inside or outside of LVM (a sketch follows just below this list). But overall, I don’t see you getting much out of LVM here at all.
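
A minimal sketch of such a throwaway-key swap (the volume path and mapping name are placeholders; on NixOS, the swapDevices randomEncryption option automates the same thing, if I recall correctly):

# /dev/vg0/swap and the mapping name are placeholders
cryptsetup open --type plain --key-file /dev/urandom /dev/vg0/swap cryptswap   # fresh random key each boot
mkswap /dev/mapper/cryptswap
swapon /dev/mapper/cryptswap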

¹ - no, don’t use LUKS-on-LVM or LVM-on-LUKS. Use ZFS with more datasets; they can share the space and give much more flexible, policy-based control over individual datasets, including encryption.
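
For illustration, a minimal sketch of that, assuming OpenZFS 2.x (pool name, dataset names and the target partition are placeholders):

# pool, dataset names and the partition are placeholders; assumes OpenZFS 2.x
zpool create -O compression=zstd -O mountpoint=none rpool /dev/nvme0n1p2
zfs create -o mountpoint=legacy rpool/root                                           # plain dataset
zfs create -o encryption=on -o keyformat=passphrase -o mountpoint=legacy rpool/home  # encrypted dataset

All datasets draw from the same pool, so there is nothing to resize, and properties like encryption or compression are set per dataset.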

1 Like

Even though I’d love to use ZFS, one always has to remember that its performance might suffer massively from choosing the wrong disk. On my system with a Samsung 860 QVO I had a constant load of 3 due to IO wait.

Also, depending on the feature set used, ZFS can become very memory hungry.

I actually use LVM with thinly provisioned LVs to get a “ZFS like” space share.

Still, I wish VDO were also available on NixOS, to get a much closer-to-ZFS experience through compression and deduplication…

1 Like

I worded that badly: I did not mean one single default configuration. I would like to present a few sensible configurations. The three encryption levels shown in the image (none, partial and full) seem to be the most sensible way to cover 90% of use cases (at least my use cases).

@uep, I did not show any LUKS-on-LVM; I only show the three layouts which I tested, and all three are LVM-on-LUKS.

I had forgotten about ZFS; the last time I played with it you could not boot from ZFS, but that should be fixed now. I’m still not sure I want to support ZFS, though - there seems to be a lot of strife around ZFS in the Linux community.

That seems odd. In my experience, and based on what I’ve read, ZFS should only really become a bottleneck with fairly non-trivial arrays of drives. It should be negligible for simple setups like a single drive.

Before I had the problems I was under the same assumption. But when the problems appeared, I was able to find evidence that other QVO users had similar issues. The theory was that the ARC and the disk’s internal cache were “disagreeing” and waiting for each other. Though, as I am neither a hardware nor a filesystem guy, even that is the simplification another kind user on the Discord made for me.

By their understanding, it might have been possible to play with the ARC settings and cache modes, though some of them cannot be changed on a running pool, and we gave up after a day of tweaking the mutable ones.

I have to admit, though, that this was not there from the beginning but crept in slowly. It took a couple of months before I realised there was this nasty slowdown of my system, and another month until I was able to pinpoint it to “system load through IO”; before that incident, I had always assumed “load” was solely CPU-based.

1 Like

Samsung QVO is a budget SSD series using QLC NAND. These drives suffer from pretty hefty performance degradation as their SLC cache is depleted.

In theory, that shouldn’t occur unless you are doing large sequential writes, but I’ve seen enough reports of abysmal QLC performance (like yours) to conclude that such SSDs are simply unfit as a boot drive or for any serious I/O at all - regardless of file system.

Any non-budget series (i.e. any SSD with TLC and DRAM) should be just fine with ZFS.

The feature-set concern is only relevant for setups enabling deduplication - which should only be used in very specific scenarios.

In general, having a bit more memory is useful because the ZFS ARC is separate from the Linux page cache. By default it will grow to 50% of system RAM, and even though it should shrink under memory pressure, I’ve run into some OOM issues in the past on RAM-limited systems. Limiting the ARC (i.e. setting zfs_arc_max) helps.
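
For reference, a sketch of capping the ARC at 2 GiB (the value is in bytes and just an example):

echo 2147483648 | sudo tee /sys/module/zfs/parameters/zfs_arc_max   # 2 GiB cap, takes effect at runtime
# to persist it, set the module option, e.g. in /etc/modprobe.d/zfs.conf:
#   options zfs zfs_arc_max=2147483648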

(Sorry for the digression)

2 Likes

Well, the colours in Figure 1 are wrong.
Personally, I have systems running a setup along the lines of Figure 2, but using MBR (older systems…), some with an additional dm-integrity layer (LUKS2 with AEAD), and some with ZFS on LUKS (instead of LVM).
Some points I learned along the way:

  • If you hibernate your system and it resumes from hibernation a second time at the next reboot instead of booting cleanly, dm-integrity will clearly tell you that some files are messed up…
  • No trimming on SSDs when using AEAD (booting will fail with allowDiscards = true;).
  • Disk space is cheap, so add a bit more swap; systems may deadlock when swap is encrypted and the system gets low on memory/swap.
  • Create separate volumes for /nix/store, /, and /home (others like /srv depending on the machine’s purpose) for easier snapshotting, disk expansion etc., especially when using ZFS (see the sketch after this list).
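
With LVM that could look something like this (the VG name, sizes and ext4 are placeholders; with ZFS you would create datasets instead):

# VG name, sizes and ext4 are placeholders
lvcreate -L 30G  -n root vg0
lvcreate -L 60G  -n nix  vg0
lvcreate -L 100G -n home vg0
mkfs.ext4 /dev/vg0/root
mkfs.ext4 /dev/vg0/nix
mkfs.ext4 /dev/vg0/home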

With that in mind, having a “prepare-disk-for-nixos” script in the installer ISO would be very nice.

@wamserma, good eye. I have fixed the colours, but I don’t think I can edit that post anymore.

The recommendation for swap size used to be 2 x RAM size, but I think that rule falls over a bit these days. Is there a new rule of thumb?

Disk space is unfortunately not super cheap with VMs yet, so perhaps the option should be to select between a single root mount and the mounts you listed. I’ve been getting into the habit of putting /tmp on its own volume too. What sort of sizes would you recommend for those volumes?

Cheers for the Lessons-Learned list.

I’m using NixOS only on desktops, so I had those scenarios in mind.
For swap on desktops I usually follow the 2×RAM rule, but 16 GB at most; it depends on the use case.
For VMs ZFS would give you the most flexibility in dividing storage space.
Maybe @grahamc can put in some advice here.

Especially for VMs, my company’s ops team prefers LVM in the guest, as you can shrink it; you can’t do that with ZFS.

1 Like

Do you guys really fiddle with your disk layout after install? Can you elaborate on use-cases if so?

On most laptops today you can have only one disk, so I go with an EFI partition and a LUKS’ed / for the rest. Now that GRUB supports LUKS (still version 1 only, though), this is easy and secure.
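
For the record, the only LUKS-specific bit is formatting with the older header so GRUB can unlock it; roughly (the partition path is a placeholder):

# partition path is a placeholder
cryptsetup luksFormat --type luks1 /dev/nvme0n1p2   # LUKS1 header, which GRUB can unlock
cryptsetup open /dev/nvme0n1p2 cryptroot
mkfs.ext4 /dev/mapper/cryptroot                     # this becomes /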

RAM has been cheap for the last 7+ years, so I just buy more RAM and skip swap altogether. On all my laptops since 2013 I have never needed swap.

My 2c.

Swap, at the very least, is useful for hibernation (suspend-to-disk), which I think is especially important on laptops.

I like having flexible disk layouts, because a year or two into using a system I will inevitably realize I was too restrictive with my /nix or my root partition, and fixing that without LVM is practically impossible without fully reinstalling the system. Even if it’s only a temporary whoopsie (like some application logging far too much stuff), it can be tricky to fix the running system without giving it a bit more space.

Conversely, I might realize I’m consistently underusing the space there, which isn’t as catastrophic but still annoying.

Granted, I picked up this habit because until recently most of my devices were stuck with 250 GB SSDs, which is actually fairly limited when your Nix store regularly grows to 40 GB and applications get larger and larger. Maybe as bigger SSDs become more accessible this becomes less useful.

Still, I’ve been burnt a few too many times, running LVM is basically zero cost, and I actually know how to use it these days, so, well, why not?
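
For anyone curious what that flexibility looks like in practice, growing a volume later is roughly this (the VG/LV names, size and ext4 are assumptions):

# VG/LV names, the size and ext4 are placeholders
lvextend -L +20G /dev/vg0/root   # grow the LV by 20G
resize2fs /dev/vg0/root          # grow the ext4 filesystem, online
# or both in one step: lvextend -r -L +20G /dev/vg0/root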

Do you have a step-by-step procedure for properly setting up thinly provisioned LVM that you could refer me to?

Check the Red Hat docs for LVM.
From memory, for a RAID1-backed thin pool it should be something like:

pvcreate /dev/sda /dev/sdb
vgcreate my-vg /dev/sda /dev/sdb
lvcreate --type raid1 -m1 -L 1G -n thinpool my-vg    # mirrored LV that will hold the pool data
lvconvert --type thin-pool my-vg/thinpool            # turn it into a thin pool
lvcreate --thin -V 10G -n thin-vol my-vg/thinpool    # thin LV inside the pool

-L 1G → physical size of the pool
-V 10G → virtual size of the thin volume (extents are only allocated as you write)
I would also add --raidintegrity y to the first lvcreate to catch soft corruption. (For a fully mirrored setup you would also create a raid1 metadata LV and pass it to lvconvert via --poolmetadata.)