Use of shallow copy / reflinks on BTRFS/XFS/ZFS

For nix flakes, inputs are copied into the nix store (from what I understand). Sometimes these inputs are from filesystem paths. I know nix can do deduplication manually and BTRFS and ZFS can do it at the FS level, but it would be nice if, when copying from a local path to the /nix/store, nix would try to use a reflink / shallow copy instead of fully copying the files. This would save both space and time and take advantage of the CoW filesystem’s native copy operations.

Does nix already do this? If not, how difficult would it be to add (happy to do the work with guidance)?

What I think you mean:

  • when copying something to the nix store, make a hard link between the existing location and the store one.

Problems with this:

  • it would only work if the source and destination are on the same filesystem, such as where everything is in a single global /. This might be common enough to be worth trying, except that:
  • it would allow manipulation of the store contents through the hard link; the store is mounted read-only but the original file is outside that

@uep hard links and reflinks are different things. Reflinks are usually only possible on copy-on-write file systems, and can be made with cp --reflink=<always|never|auto>. Reflinks are indistinguishable from regularly copied files, but they can be created instantly and consume no space at first. They share all their blocks on disk with the original, but when either is modified, the new data is written to new locations on disk, and only the modified one is updated to point to that new data.
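To make the distinction concrete, here is a small Python sketch. It is only an illustration, not anything Nix does, and the function and file names are made up. It uses a plain copy as a stand-in for a reflink, since from the user's point of view a reflink behaves like an independent copy, and it assumes the temporary directory supports hard links.

```python
import os
import shutil
import tempfile


def link_vs_copy_demo():
    """Write through a copy and through a hard link; report what the original holds."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        with open(original, "w") as f:
            f.write("v1")

        # Hard link: a second directory entry for the *same* inode.
        linked = os.path.join(d, "linked.txt")
        os.link(original, linked)
        same_inode = os.stat(original).st_ino == os.stat(linked).st_ino

        # Independent copy: stands in for a reflink, which looks the same
        # to the user but shares blocks on disk until one side is modified.
        copied = os.path.join(d, "copied.txt")
        shutil.copyfile(original, copied)

        # Writing to the copy leaves the original untouched.
        with open(copied, "w") as f:
            f.write("changed-copy")
        with open(original) as f:
            after_copy_write = f.read()

        # Writing through the hard link changes the original too.
        with open(linked, "w") as f:
            f.write("changed-link")
        with open(original) as f:
            after_link_write = f.read()

    return {
        "same_inode": same_inode,
        "after_copy_write": after_copy_write,
        "after_link_write": after_link_write,
    }
```

The hard link is the case that would let someone alter store contents from outside the store; the copy (and therefore a reflink) does not have that problem.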

TIL, and that is why I stated my understanding first; thanks

Yeah, as @ElvishJerricco said, a reflink is basically equivalent to a copy, just faster. I have my nix store on a large magnetic disk array. I know I can probably spend some time tuning it for small copies. Or I could have nix do a reflink, which is a fast, basically instant copy.

As for the same filesystem… yes, that is true. IIRC, the filesystem ioctl (FICLONE) that does this will fail if the filesystems are different, and then you fall back to normal copying.
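A rough sketch of that try-then-fall-back pattern, in Python for brevity. This is not Nix code; the helper name `reflink_or_copy` is hypothetical, and the numeric fallback for the FICLONE request is an assumption that holds on common Linux architectures such as x86-64 and aarch64 (Python 3.12+ exposes `fcntl.FICLONE` directly). For simplicity it treats any `OSError` from the ioctl as "reflink unavailable".

```python
import fcntl
import shutil

# FICLONE ioctl request number. Assumption: the literal is only used on
# Pythons older than 3.12 and matches x86-64/aarch64 Linux.
FICLONE = getattr(fcntl, "FICLONE", 0x40049409)


def reflink_or_copy(src, dst):
    """Copy src to dst, sharing blocks when the filesystem allows it.

    Returns "reflink" if the FICLONE ioctl succeeded, "copy" otherwise.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # ioctl(dest_fd, FICLONE, src_fd): make dst share src's blocks.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return "reflink"
        except OSError:
            # Different filesystems, no CoW support, non-Linux kernel, etc.
            # Fall through to an ordinary byte-for-byte copy.
            pass
        shutil.copyfileobj(fsrc, fdst)
    return "copy"
```

Either way the caller ends up with an independent file at `dst`; the return value only says which path was taken, so on ext4 or tmpfs you would expect "copy" and on btrfs or reflink-enabled XFS you would expect "reflink".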

Copy-with-extra semantics is not reliable or portable. Copy alone is not a universal filesystem feature.

@ehmry FWIW, cp --reflink=auto will just fall back to copying if reflink isn’t available, so it’s always safe to use.

By “always safe to use” do you mean Linux+GNU coreutils or something less specific?

Well nixpkgs only uses GNU coreutils, even on macOS

cp is a shell utility and Nix is not a shell script. “Copy” is an abstract concept; it’s not something that you can do at a low level without breaking abstractions and making code more expensive to maintain.

Considering the portion of nix users who use it on linux, a fair number of whom have a CoW filesystem, that maintenance burden seems entirely justifiable. It’s really not difficult to try a reflink first before initiating a normal copy, but only on linux.

Unfortunately, reflinks are not even supported on ZFS (openzfs/zfs#405). I have seriously considered switching to btrfs because of it, but haven’t so far. If Nix could take advantage of this, that might push me over the edge.

I want to try btrfs, but I have read so many bad things about it. Maybe the bad things I read are not true, and as with all software, it gets better over time.

Maybe keep things that can be easily rebuilt on btrfs, and more important stuff on zfs?

Note that GNU coreutils uses copy-on-write for cp and install by default in the versions currently in master and 22.11, so --reflink=auto is redundant.

This doesn’t have an effect on whatever Nix itself does, obviously. I think there might be a NAR serialization step in ingesting stuff to store?

One barrier to reflinks that has existed was that you couldn’t create reflinks across VFS boundaries. Incidentally, the nix-daemon creates one by default via the read-only bind mount.

At least for btrfs, this has been fixed in recent kernels which allow reflinks inside the same FS even across VFS boundaries, so this is something we can actually start to actively use now with 6.1 becoming the new LTS kernel. Would be great to have Nix use reflinks wherever possible for 23.05.

Remember when the Linux API became stabilized and we got all those games ported over and performance was better and we didn’t have to use Wine anymore? This is going to be like that.

How on earth is the entire gaming industry analogous to “we want to add code in the Nix package so it can reflink when it would copy sometimes”?

I assume in the general “some background thing was made faster so lots of stuff benefits” way? It’s a stretch to me, too, though.

Interesting idea, but how would this interact with the upcoming source tree abstraction? From my understanding that PR eliminates the need to even copy many inputs to the store since they are fetched lazily. Things like patches and zip extractions are done in-memory which also saves a ton of disk space. I suppose that shallow copies could still be used for situations where store paths are materialized but a more generalized solution might be desirable - as discussed this would only really apply to BTRFS.

Source tree abstraction only helps in certain cases. If you set src = self; in your derivation, the whole flake is getting copied anyway, and it would still be nice if it was done with a reflink if possible. Same issue if you pin a copy of self in your registry in a nixos config.