Distributing the Nix store with CVMFS+Nix?

Leaving this topic in “Uncategorized” because it is more of a curiosity that won’t fit “Learn” and is probably too verbose for “Links”…

Just found out about CernVM File System (CernVM-FS, CVMFS) which is a

software distribution service […] implemented as a POSIX read-only file system in user space (a FUSE module). Files and directories are hosted on standard web servers and mounted in the universal namespace /cvmfs.

The CVMFS overview states that

In contrast to general purpose network file systems such as nfs or afs, CernVM-FS is particularly crafted for fast and scalable software distribution. Running and compiling software is a use case general purpose distributed file systems are not optimized for.

and it doesn’t seem to prescribe any specific build system or package manager to use it with, and so Nix could be a perfect fit for this purpose. In fact, similar research projects already exist:

CVMFS with Nix sounds very similar to what flox is aiming for, but a direct comparison seems unfair because even after superficially looking at the resources above, CVMFS+Nix will require a lot of plumbing. An interesting idea though.

@siscia I initially found out about CVMFS from your post (just saw your announcement as well), and your name also popped up many times when I was doing the research. Are you still using this or a similar setup? Would love to read about your experiences.


2 Likes

I think the main blocker for something like this is that only a single instance of the Nix daemon would be able to modify the store; otherwise there would be race conditions galore. It could still be possible if Nix’s TCP support ever gets encryption, but that could be a ways off, if it ever happens at all.

Even then, you’d probably want to have some kind of “high availability” mode to the Nix daemon so if one instance fails, another can take over as the “leader”.

Still, it’s a very interesting idea. I really like what Flox is doing, but my main issue with it is the currently closed nature of the project. I’d much rather see Nix proper doing something like this.

2 Likes

From what little I gathered about CVMFS, this probably wouldn’t be an issue because only Stratum 0 is writeable anyway, and all else is supposed to be read-only. (Then again, I’m neither a Nix nor a CVMFS guru.)

Didn’t even know about this, thanks!

Not at home with HPC or high-availability systems, but this sounds like a direct quote from Erlang.

Not sure what the flox plans are in the long term, but its use case is more and more sought after; there are multiple threads on this topic on the NixOS Discourse alone where people are asking for this functionality, and even more have re-created it at their companies at different levels. The issue is duplication of effort, and the process is not documented (only bits and pieces can be found here and there).

Hey there,

sorry to reply late, but I was not expecting to be summoned in this forum.

I used to work on CVMFS, and from the inside I can really say that it is a marvelous piece of engineering which unfortunately does not get the recognition it deserves.

CVMFS is used to distribute complex software stacks, like the ones found in HEP (High Energy Physics); think about using software to simulate the physics that should happen (from what we know) inside a particle accelerator.

The software used for those simulations ends up being very complex, with very complex dependencies. Moreover, the software stack ends up being rather large.

For ATLAS (one of the 4 big CERN collaborations), the whole software stack is measured in gigabytes.

The overall idea is to build the software once and distribute it to everybody interested: servers running simulations, physicists developing simulations, or software developers creating software.

Being so widely used, CVMFS is quite sophisticated. It features storage deduplication (the same file is stored only once) and compression, and it is highly cacheable and scalable. It supports object storage (S3 API) as a backend, besides a classic file system.
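To make the deduplication point concrete, here is a minimal sketch of content-addressed storage with compression. It only illustrates the general idea: the `ContentStore` class, the choice of SHA-256 and zlib, and the `/cvmfs/example.org/...` paths are all assumptions made for the example, not CVMFS's actual on-disk format or API.

```python
import hashlib
import zlib


class ContentStore:
    """Toy content-addressed store: identical file contents are kept once."""

    def __init__(self):
        self.objects = {}  # content hash -> compressed bytes
        self.catalog = {}  # file path -> content hash

    def add(self, path, data):
        digest = hashlib.sha256(data).hexdigest()
        # Store the payload only if this content has not been seen before.
        if digest not in self.objects:
            self.objects[digest] = zlib.compress(data)
        self.catalog[path] = digest
        return digest

    def read(self, path):
        # Look up the content hash for the path, then decompress the object.
        return zlib.decompress(self.objects[self.catalog[path]])


store = ContentStore()
payload = b"same bytes" * 1000
# Hypothetical paths: two copies of the same library in different trees.
store.add("/cvmfs/example.org/a/libfoo.so", payload)
store.add("/cvmfs/example.org/b/libfoo.so", payload)
print(len(store.catalog), "paths,", len(store.objects), "stored object(s)")
```

Two paths end up pointing at a single stored object, and because objects are named by their content hash they never change, which is what makes this kind of layout easy to cache behind plain web servers.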

You can see how CVMFS fits quite well with what Nix is trying to do, and it is natural that people who are aware of both systems try to put them together.

I really believe that the two systems are a very good match.

However.

The main difficulty in using CVMFS and Nix together is that both of them require their own directory at the root of the file system to work well.

CVMFS stores its data in /cvmfs and Nix stores its data in /nix, which of course do not match.

There are tricks to make things work (chroot, mounting, etc.), but they require some level of complexity that would need to be managed on the client side, and since we will end up having many more clients than servers, I thought it was better to manage this on the server side.

Nix supports using a store directory other than /nix, but this comes with some difficulties.

First, you will need to compile Nix itself, which is not super friendly, but overall doable.

The real issue is that using Nix with a store path other than /nix is not widespread in the community, and for good reason. Unfortunately this means that I found a few issues with some recipes not being completely aware of the possibility of using a different store path.
(All my small contributions to Nixpkgs are actually fixes for these problems.)

Then, you will need to recompile the whole software stack, starting from the compiler and the stdlib, which means that you cannot use the Nix binary cache, and it is just very slow, so iterating on problems is not super smooth.
(My lack of experience with Nix didn’t help either!)
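The reason the cache cannot be reused is that the store directory is itself part of what gets hashed into every store path, and built binaries embed those absolute paths. Below is a simplified sketch of that idea. It is deliberately not Nix's exact algorithm or encoding; the function `toy_store_path`, the fingerprint layout, and the `/cvmfs/example.org/nix/store` directory are assumptions made for illustration.

```python
import hashlib


def toy_store_path(store_dir, name, content_hash):
    """Simplified illustration, NOT Nix's exact algorithm or encoding."""
    # The store directory is part of the hashed string, so the same package
    # gets a different path (and cache key) under a different store dir.
    fingerprint = f"output:out:sha256:{content_hash}:{store_dir}:{name}"
    digest = hashlib.sha256(fingerprint.encode()).hexdigest()[:32]
    return f"{store_dir}/{digest}-{name}"


# Same (made-up) package contents, two different store directories.
content_hash = hashlib.sha256(b"hello-2.12 build inputs").hexdigest()
default = toy_store_path("/nix/store", "hello-2.12", content_hash)
relocated = toy_store_path("/cvmfs/example.org/nix/store", "hello-2.12", content_hash)
print(default)
print(relocated)
```

The two printed paths differ in their hash part as well as their prefix, so nothing built for /nix/store can be looked up for the relocated store, and everything down to the compiler has to be rebuilt.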

So, to sum up:

  1. It is a great idea!
  2. It is doable!
  3. It requires some effort!

I hope I answered your questions, but if you have more questions, feel free to write to me.

4 Likes