Secure sharing of nix store with containers and VMs


It’s not only secrets like passwords, keys, etc. that are sensitive. Any information about configuration, network infrastructure, installed packages, services, etc. can help an attacker. For that reason I’m seeking an effective solution that gives each “client” (container or VM) of a shared nix store access strictly to the packages in its own closure and not one bit more.

A solution with bind mounts (or virtiofs mounts in the case of VMs) seems to work, but I still don’t like the fact that it requires so many mount points. Wouldn’t it be cleaner, more resource-friendly (and probably more performant) if there were a simple read-only proxy file system that just delegated read operations to the real file system according to a configurable map of paths?

If such a file system really does not exist yet for Linux, and there is no better solution than creating thousands of mounts for each store client, I could take a look at what it would take to create a simple version of it with FUSE. It would not be as performant as a kernel driver because of the context switches, but it would work, and if it turned out to be practical, maybe someone with kernel development experience could then create a proper VFS kernel driver as a replacement?
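For concreteness, the “many mount points” approach I mean looks roughly like the sketch below (placeholder paths, not something I run as-is; a real setup would do this per container/VM and handle corner cases like store paths that are single files):

```sh
# Placeholder paths: ./result is the client's top-level store path,
# $guest_root is wherever the client's root filesystem is staged.
toplevel=./result
guest_root=/var/lib/clients/a/rootfs

# One read-only bind mount per store path in the client's closure.
nix-store --query --requisites "$toplevel" | while read -r path; do
  mkdir -p "$guest_root$path"                  # assumes the path is a directory
  mount --bind "$path" "$guest_root$path"
  mount -o remount,bind,ro "$guest_root$path"  # classic two-step read-only bind
done
```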

I started with Nix just a few weeks ago, so it’s only logical that I am missing something :-). Please let me know whether such a file system would be of help from your point of view, or whether there is some hard problem hiding in its implementation that I am not seeing yet (and which is probably the reason it does not exist yet).

3 Likes

Good point, I suppose.

That’s effectively how bind mounts work anyway, afaik. I don’t see how this would get you anything except an extra layer of indirection you don’t need.

Nix actually does exactly this for the build sandbox. It does the build in a chroot with only the build’s input store paths bind-mounted, so you can’t, for example, enumerate the full nix store from within a build; you only see the paths in your inputs.
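You can see this from a trivial build, roughly like so (assuming sandboxing is enabled, which it is by default on Linux, and that the nix-command/flakes features are on; nix-build -E works the same way otherwise):

```sh
# A throwaway build whose only job is to count what it can see of /nix/store.
nix build --impure --expr '
  with import <nixpkgs> { };
  runCommand "peek-store" { } "ls /nix/store | wc -l > $out"
'
cat ./result   # a handful of paths (this build's inputs), not the whole host store
```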

2 Likes

Bind mounts are not that cheap to set up. If you are willing to dive down the rabbit hole, Landlock or other emerging security frameworks can help you.

1 Like

FWIW, I’m writing something like that for VMs.

In short, it’s a virtio-fs daemon which exposes a subset of /nix/store (or any other FS tree one cares to, I guess?); it just takes the path to the FS root to expose, and a list of files/directories in that tree to expose (like closureInfo can produce).

The end result is a (virtualized) read-only file-system that contains the VM’s environment (and its closure) mounted under /nix/store.
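The file list side of that is easy to produce with closureInfo; roughly (illustrative only, `hello` stands in for whatever root paths you actually want to expose):

```sh
# Build a closure listing for some root paths.
nix-build -E 'with import <nixpkgs> { };
  closureInfo { rootPaths = [ hello ]; }'

cat result/store-paths   # one store path per line: the full runtime closure
```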
virtio-fs’s protocol is basically just yeeting memory pages with the desired content, so there’s no copying involved:

  • the daemon mmaps the files the VM requests and transfers the pages to the hypervisor (which then passes them on to the VM);
  • reads from the VM directly hit those pages; the host’s kernel populates them as needed, without going through the hypervisor or the daemon;
  • the guest kernel doesn’t copy that data into its own page cache, thanks to DAX (“Direct Access for Files”), so there’s no overhead in memory usage either.

It’s not released yet, but hopefully soon.™

PS: My main goal is basically to get UX similar to that of config.containers (incl. quick (re)build, etc.) and very low overhead, while providing much stronger isolation: no exposure of the whole store, a modern hypervisor designed for security (cloud-hypervisor: it’s written in Rust and sandboxes its various components, like the virtio device implementations), etc.

4 Likes

Anyone got any updates on this?

Having VMs built with ‘nix build’ be able to access the host’s binaries and configuration is a major issue for me as well.

The guest should be isolated from the host filesystem and have no access to, or visibility of, any of the host’s files. Being able to execute, from the guest, binaries that are in the nix store only because they were installed on the host or on other guests of that host is quite a severe violation from my point of view.

Or is there any benefit to, or even a need for, the current behaviour?

Regarding executing binaries, what’s the issue exactly? The exact same binaries could just as easily be produced in the guest if desired, in most cases (unless the nix code/sources used are secret). They’re not setuid or anything; they’re just executable files that happen to be on the filesystem. That shouldn’t violate any security boundary unless the content of the binaries themselves is secret, which doesn’t sound like what you’re talking about. Unless your system is EXTREMELY locked down, any user could curl binaries straight from the internet and run them. I don’t see how the binaries in the nix store are different.

From my point of view it’s pointless to discuss, in this context, hypothetical vulnerabilities resulting from access to the host OS’s packaged binaries; it distracts from the main issue, which is the possibility of isolating the guest VMs from host system files. I’m happy to take that particular discussion somewhere else if anyone is up for it.

Maybe not to everyone, but it seems desirable to at least part of the community that guest VMs are not able to access files that are part of the host system. This includes not only binaries, but also configuration files such as, for example, the configuration.nix of the host NixOS or of other guest VMs, as well as the firewall configuration and so on. Even if it doesn’t seem like a big deal to some, I’m sure it is for others.

2 Likes

The problem I see is that someone with access to VM A, inspecting the shared store, could deduce the configuration and/or presence of other “guests” as well as of the “hypervisor”/host system, which is indeed a concern.

That very same reason is why we moved away from shared hosting services years ago!

Yeah, that concern makes sense to me, though its importance is obviously context-dependent. I’m just having trouble seeing the issue with the ability to execute the binaries… Anyway, I agree this topic should be about how to do it, not why it’s needed. Sorry to derail.

1 Like

I was testing a workaround which consisted of using the --store ‘/somePath’ flag for ‘nix build’ when building the VM. This would allow me to identify the necessary paths with ‘nix path-info’, and then I could delete whatever else I wanted from the ‘/somePath’ nix store. Finally, I’d edit the resulting ‘./result/bin/’ script to change the mount point of ‘/nix/store’ to ‘/somePath/nix/store’. I think this should work in principle, but I’ve stumbled on what I think is a bug in nix: the ‘/somePath’ nix store creates symlinks that point to ‘/nix/store/’ rather than to ‘/somePath/’, although ‘/somePath’ has the files they should be symlinked to. It is not possible to alter the targets of the symlinks in ‘/somePath’.

  1. Would anyone not see this ‘nix build --store /somePath’ behaviour as a bug? I can’t see why the new store would point its symlinks at the original/default store.

  2. Do you think this would work?

  3. Any other suggestions or alternative methods?

It’s correct behavior. Even when using --store, the build process and the expected runtime usage work under the assumption that the store will be mounted at /nix/store. It just assumes that what’s currently at /somePath/nix/store is what will be mounted there. And this should be fine for you, so long as /somePath/nix/store is actually mounted at /nix/store from the guest’s point of view, making the links resolve correctly in the guest.
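Schematically (hashes and the mount tag are placeholders, and the exact mount incantation depends on how the run script shares the store; 9p is shown only as an example):

```sh
# On the host, entries in the alternate store may link into /nix/store,
# so they look dangling there:
ls -l /somePath/nix/store/<hash>-foo
# ... -> /nix/store/<hash>-bar

# In the guest, the alternate store is mounted *at* /nix/store:
mount -t 9p -o trans=virtio,ro <mount-tag> /nix/store
# Now /nix/store/<hash>-bar exists, and the same symlink resolves.
```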

If you want the actual, canonical location of the nix store for build time and run time to be something other than /nix/store, you need to a) recompile nix, b) give up on using the binary cache at all, c) probably fix some bugs, since this usage path is pretty overgrown with weeds. I tried getting it to work a while back and gave up, but it should be possible, in theory. I’ve heard of others getting it working.

1 Like

Thanks for the explanation. However, it fails on the guest for some reason, which I think is because of the symlinks that point into the host’s ‘/nix/store’.

When the guest is loading stage 2, it errors with ‘systemd: Unit default.target not found.’, and when I check for default.target in ‘/somePath/nix/store/’ it is there, but as a symlink to a path in ‘/nix/store/’. None of the stage 1 files are symlinked to the host’s ‘/nix/store’.

The links should be resolved in the guest, so if they point to /nix/store, and the host’s /somePath/nix/store is what’s mounted there, then they should point correctly into the same store, even if they look broken on the host.

Unless the method you’re using to mount the host’s filesystem into the guest is pre-resolving symlinks on the host side for some reason, in which case you need to figure out how to stop it from doing that. It’ll mess up a lot of things.

I’m not doing anything fancy, just changing the nix store mount path in the QEMU VM run script. Stage 1 works fine, which means the mount works and the files are found, but then stage 2 fails with the error above. The only difference I could find was the one I mentioned about the symlinks. Not sure what else to look at at this stage.

Well, stage 1 runs out of an initrd, so I believe it doesn’t tell you anything about whether the mount is working as intended.

…with the initrd being in ‘/somePath/nix/store’.

It feels like the method above gets the process near the desired state of having a separate store for the built VMs, but it fails at stage 2 of the guest VM boot process when it tries to access some files from the store (it still seems to me that this happens due to the files being symlinked to the host’s ‘/nix/store’, but I couldn’t confirm it 100%).

I’ve moved on from this path as I wasn’t able to progress any further.

We just released a project that does exactly this: images that only reference Nix packages but do not contain them, with the container runtime extended in a first-class way to bind mount all of the referenced packages into the container and nothing else.

See: Nix-snapshotter: Native understanding of Nix packages for containerd

2 Likes