UX problems with Flakes and custom caches (substituters)

Nix Flakes provide a great way to publish and maintain packages outside of nixpkgs, making it easier to “federate” the Nix experience in a sense.

The Problem

One major downside of publishing a custom flake (outside of nixpkgs) is no longer being able to take advantage of Hydra to cache your flake’s packages.

To work around this, the most common approach is to set up a custom cache, often with cachix (thanks @domenkozar!!!).

While it’s easy to set up a custom cache, it is difficult to make that cache accessible to users in a pleasant manner. In order for users to actually use your custom cache, they need to explicitly trust it.

It is important to note that the Hydra cache is special and does not require this extra step, as it is a trusted substituter by default across all Nix installations.

A user can explicitly trust your flake’s custom cache with one of the following approaches:

  1. Update their Nix configuration to include your cache as a trusted substituter.

  2. Interactively accept the substituter from extra Nix configuration (nixConfig) provided by the flake itself.

Both of these approaches currently have some major usability issues:

Problems with Updating Nix Configuration

  • Requires loudly documenting this extra step within a package’s README or installation guide. If the user doesn’t happen to read this part of the documentation, they can end up spending hours waiting for your project to be built from scratch.
  • Documenting this step clearly is verbose as the process of updating Nix configuration differs between NixOS (requires updating your NixOS configuration) and other platforms (requires updating /etc/nix/nix.conf).
  • Adding a cache to the trusted substituters in your Nix configuration means it will be queried for all packages, not just the package for which it is relevant. This is problematic because substituters appear to be checked sequentially, and each one can take multiple seconds before timing out. As a result, if you have more than 2 or 3 substituters listed in your configuration, it can in many cases take vastly longer to check your substituters for a package than to simply build it from source, defeating the whole purpose.
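
For reference, here is a minimal sketch of the NixOS side of this step. The cache URL and public key are placeholders for whatever the flake author publishes, and the `nix.settings.*` option names assume a reasonably recent NixOS release (older releases used differently named options).

```nix
# Fragment of a NixOS configuration module (e.g. configuration.nix).
# The URL and key below are placeholders, not a real cache.
{
  nix.settings = {
    # Caches that Nix will query for pre-built store paths.
    substituters = [ "https://example-project.cachix.org" ];

    # Public keys whose signatures on substituted paths are accepted.
    trusted-public-keys = [
      "example-project.cachix.org-1:<public key published by the cache>"
    ];
  };
}
```

On other platforms the same information goes into /etc/nix/nix.conf, typically as `extra-substituters` and `extra-trusted-public-keys` lines, which is why the documentation ends up covering two separate procedures.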

Problems with Interactively Trusting Substituters

To clarify, this refers to the feature where Flake providers can specify a nixConfig attribute that lists extra substituters in the flake itself. Doing so means that when a user uses your flake, they will be interactively prompted to accept these new substituters.
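
A minimal sketch of such a flake, for anyone who hasn’t seen the attribute before. The cache URL, the key, and the choice of `hello` as the package are placeholders picked for illustration only.

```nix
# flake.nix
{
  description = "Hypothetical flake that advertises its own binary cache";

  # Settings the flake asks the user to accept. Placeholder URL and key.
  nixConfig = {
    extra-substituters = [ "https://example-project.cachix.org" ];
    extra-trusted-public-keys = [
      "example-project.cachix.org-1:<public key published by the cache>"
    ];
  };

  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }: {
    # A stand-in package; a real flake would expose its own derivation here.
    packages.x86_64-linux.default = nixpkgs.legacyPackages.x86_64-linux.hello;
  };
}
```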

  • This feature is still relatively new and severely under-documented. E.g. its existence is briefly mentioned in the Nix flakes wiki, but there is no commentary on how it works, how a user’s end configuration is resolved, or how any of the ambiguities around trust are handled. There are loads of Nix configuration attributes that make no sense for a Flake to specify on the user’s behalf.
  • The interactive process breaks the declarative workflow. nix build no longer simply works or fails; instead, a user must check their console for a prompt. This makes automated workflows (e.g. in CI) that use the flake more complicated to set up.
  • The answers provided by the user must be stored somewhere, which means the build process for each flake becomes stateful. It is not obvious to the user how they can revoke access if they previously accepted, or give access if they previously denied. It is not clear whether or not these values are stored for the same flake when referred to as an input from other packages.
  • Even if a user accepts the proposed configuration, there is a high chance that their acceptance is ignored without explanation. This is because by default, users are not a part of the trusted-users set, which is a requirement for the user to be able to accept the cache. Not only is this requirement not the default, but it is unclear to users whether they should add their own user to trusted-users. E.g. is it safe? What if I only want to trust my user for this package and not all others? See this issue for details.
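
For context, this is roughly what adding yourself to trusted-users looks like on NixOS. The user name `alice` is a placeholder, and the option name again assumes a recent NixOS release. As far as I understand it, a trusted user gets very broad control over the Nix daemon and store, which is why the “is it safe?” question matters.

```nix
# Fragment of a NixOS configuration module.
{
  # "alice" is a placeholder user name. Users listed here may set
  # otherwise-restricted options, such as adding new substituters.
  nix.settings.trusted-users = [ "root" "alice" ];
}
```

Note that this applies to the user globally; the sketch shows no way to scope the trust to a single flake or package.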

Potential Solution - Cache Lock / Committing binary hashes?

I wonder if there is an alternative way to allow Flakes to specify trusted binary hashes?

E.g. perhaps a flake could optionally commit a table of input hash → binary hashes for each platform?

Perhaps this could be a part of flake.lock, or some dedicated flake.cache-lock or similar?

The idea here would be to remove the need to interactively trust caches or touch local configuration, and avoid the need to fully trust caches and instead allow for validating binaries against a set of known hashes.
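
To make the idea a bit more concrete, here is a purely hypothetical sketch of what such a table might contain, written as a Nix attribute set for readability. Nothing like this exists in Nix today; the file name, the field names, and the layout are all made up, and the hashes are left as placeholders.

```nix
# Hypothetical contents of a flake.cache-lock. Illustration only.
{
  version = 1;

  # Per platform: input hash of a derivation output -> expected hash of
  # the binary (NAR) that a cache is allowed to serve for it.
  binaries = {
    "x86_64-linux" = {
      "<input hash of package>" = "sha256-<expected binary hash>";
    };
    "aarch64-darwin" = {
      "<input hash of package>" = "sha256-<expected binary hash>";
    };
  };
}
```

With a table like this, a substituted binary whose hash is not listed for the current platform could be rejected without the cache itself ever being trusted.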

All that said, the workflow in maintaining such a lock for multiple platforms could be tricky. It’s simple enough to imagine how a flake.cache-lock might be generated/updated as a part of nix build, but updating these hashes for alternate platforms would likely be a lot more annoying (e.g. projects would need to come up with some way of performing these updates and generating these commits from other platforms on CI).


Please let me know if I’m missing some obvious, nicer approach to providing a nice user experience around flakes that provide custom caches! I’m still relatively new to grokking the details of Nix and providing custom caches, so there’s a good chance I’m missing some obvious insight.

Any thoughts/advice on this appreciated :pray:

I personally don’t understand the threat model that these option confirmations want to prevent.

If you trust the flake source, which can pull in any source from around the internet, you also trust whoever provides the binaries.

Preventing a binary cache from being poisoned is something we have to work on, but I don’t see how this step helps at all.

This prevents any user on a multi-user system from using a binary cache that could provide fake derivations with the same hashes as official derivations, which would let people escalate privileges or install a rootkit.

I’m currently working on improving the error messages for this situation, but this won’t solve the underlying bad user experience (or user interface?).

Could you describe the threat model and how this feature prevents the threat?

It should be possible to evaluate/build an untrusted flake without having to risk that it poisons the local Nix store via malicious binary caches. Building an untrusted derivation is fine because it’s sandboxed, and similarly you can run the resulting closure in a sandbox. But the system can be compromised if Nix has pulled in a compromised dependency from a bad binary cache.

The ultimate solution to the configuration problem is CA derivations, since they don’t require substituters to be globally trusted.

The threat model is a user who sets up a rogue binary cache that provides fake derivations with malicious content, using the same hashes as packages from cache.nixos.org, which would allow them to replace packages in an upcoming system update with their own versions.

The confirmation doesn’t help at all with regard to this threat, because you can’t use the substituter.

I think the long term solution to this problem is CA derivations and something like Trustix.

I agree with this and don’t object to it at all.

This is where I don’t understand the threat model: if you’re fetching a flake from an untrusted source, then how do you know that source is not doing something malicious?

If the binary cache becomes poisoned at some later time, I don’t see how asking the user to confirm use of the binary cache will help; it’s just trust on first use.

Maybe I don’t understand the threat model, that’s why I’m asking for the scenario.

A substituter could “just” reply to any given drv hash, with a plausible NAR which contains a malicious binary. (Simplifying the substitution process here as the details aren’t relevant for what comes next)

This binary again could be made available by linking it with many well known and often used names, such that it is likely to shadow something and gets called by accident when trying to enter a nix develop or nix shell.

Candidates would be whatever is usually in coreutils/busybox and perhaps the big 3 shells…

This is the attack vector that I came up with, without actually doing deep thinking about it.

How do you prevent someone from doing this as part of the build, not using the binary cache?

You can’t do that locally because the nix daemon is responsible for computing the hash, and as the hashing algorithm is safe from collision, you can’t use different inputs to produce the hash you want to match a well known derivation.

One cannot “spoof” the hash of some central derivation that way.

Since it is relevant to the discussion at hand, and because it is not exactly obvious to newer users, I’ll expand just a bit by saying that CA derivations + Trustix would basically solve this because we can independently verify the content hash, and the more folks who have signed the package with a given hash, the more confidence we can have that this package is indeed what it claims to be.

The really powerful piece though is the CA part. We don’t need to “trust” that the derivation is not corrupted at all; we can simply rebuild and verify its contents ourselves, at least if we can get the build to be truly reproducible. So I guess there is a third piece really:

CA + Trustix + Reproducible Binary

But even without the last part, we can still verify the contents of a pulled binary against its claimed hash ourselves, and if we trust the users who have signed off on it, we can at least assure ourselves that the contents have not been maliciously changed from under us.

That works as long as the package validating the CA hasn’t been tampered with.

It reminds me a lot of the guix challenge command, which is used to locally verify whether a remote substituter is providing the same hash.

Well, it’s just a content hash, right? So theoretically we can use external tools to verify the closure if necessary and we are really paranoid about it.

Oh I should probably clarify that by CA I mean “content-addressed” derivations and not “Certificate Authority” :sweat_smile:

CA + Trustix + Reproducible Binary

Thanks for sharing @nrdxp! Could you elaborate a little on the role that Trustix would play here? Is Nix’s CA support alone not enough to pull off something like this?

Is it perhaps the case that Nix’s CA feature is limited to enabling content addressing, but there are no plans to actually enforce it, so as a result something like Trustix would need to be used? I’m curious where the line is drawn there.

As for the reproducible binary part, I’d imagine CA will at least help to check this, but otherwise it’s up to Nix’s sandboxing to kind of enforce this, no? I guess it’s tricky to fully rely on the sandbox disabling access to all kinds of side effects. E.g. can we even disable system RNGs? Curious what the sandbox limits are here.

Trustix provides a distributed trust model where a user can independently build and verify the hash of a given derivation and then sign it. I haven’t played with Trustix much yet given that CA is still experimental (as is Trustix for that matter), but it essentially provides a mechanism for corroborating this process so that many users can all independently verify and sign off on a given package. Of course that trust is gonna be a lot stronger if the derivation in question is entirely reproducible at the binary level.

The artifact itself can be trusted with the CA, though how do you know the hash before building?

That’s what we need the trust for: to map from the input-based address to the content address.

Ahhh, I was thinking the plan might be to eventually include the CA in the flake.lock, similar to how we store the narHash in the flake.lock today.

I guess this would not be practical as:

  1. it would require being able to successfully build the package before the flake lock could be updated and
  2. CAs will differ based on the target platform.

Looking forward to seeing what comes of Trustix :slight_smile:

I cannot reliably include the CA of my outputs in the lock file, as the lock file or git metadata might affect the content of the path.

So as soon as I commit the lockfile with my CA, the git SHA changes, which would result in a different version string, which would result in a different content hash, which would result in a different lock file that has to be committed, which in turn changes the git SHA, which changes the content, which …
