Best practices for Nix at work

samueldr · March 25, 2025, 10:02pm

Are you representing Determinate Systems with this answer? While yes, what you said is all true, this is reading charitably between the lines, and does not match their carefully chosen wording.

Nowhere in the section (Follow the principle of least access) is it specified that the reason for their claims is the reasoning you state.

And the section does not support your argument well either, since their solution to the upstream cache being “untrustworthy” is to use their product, a private cache as a service.

What we have is only an off-handed comment about the trust of the cache.

So I believe the question still stands: is this FUD to make their product seem needed for «organizations of any level of “seriousness”»[sic], or is there a deeper issue that Determinate Systems knows about, where the official cache cannot be trusted?

paperdigits · March 25, 2025, 10:09pm

Does it? They build from the source they obtain and they serve you the content of their private cache. Unless you also build and verify against theirs, then how do you know? And if you’re building your own, then why do you need them?

To me their argument doesn’t really hold weight: you’re trading trust in one thing for trust in another.

arianvp · March 25, 2025, 11:36pm

Are you representing Determinate Systems with this answer?

No. I don’t have any affiliation with DS.

arianvp · March 26, 2025, 12:05am

They advise using a private cache like FlakeHub. But any private cache gives you this security guarantee. There is no need for a deeper issue with https://cache.nixos.org to see it as a good idea to use a private cache for your builds. A lot of companies run internal caches. This is not some wild thing but good security practice.

arianvp · March 26, 2025, 12:06am

Flakehub Cache is just a cache. They don’t do builds. You build and then push to their cache. So you know what you’re pushing. I assume the signing is happening locally but I haven’t verified that. I know that e.g. cachix does remote signing which I like less.

phaer · March 26, 2025, 7:53am

I know that e.g. cachix does remote signing which I like less.

Cachix also supports local signing, it’s just opt-in:
https://docs.cachix.org/getting-started#signing-key-advanced
(Just a user who likes the service)

Montmorency · March 26, 2025, 9:42am

I’m still working on integrating flakes across projects and trying to see what designs work best. I’m just wondering how best to rationalize the suggestion to avoid flake-utils/flake-parts (which have made their way into a number of my projects) i.e.:

… but any time you introduce a flake input into your dependency tree, it’s then an input to anyone depending on your flake (and anyone depending on that, and so on). And so you should strive to eliminate unnecessary inputs when possible, and we think that these are generally not necessary.

And the suggestion:

So how many flakes do you need? There’s no hard-and-fast rule here, but in general we recommend publishing one flake per versioned thing, where a “thing” could be, for example, a versioned package like a CLI tool or NixOS configurations for a service or a series of Home Manager configurations.

I might even venture to say that when in doubt, just create another flake. You can always consolidate into larger flakes later if need be.

flake-parts seems to be like a consolidated flake that handles systems, modules, etc. in a layout that fits well together and also makes it easier to orient when integrating other people’s projects with your own as the project layout is more familiar.
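To make the quoted point about transitive inputs concrete, here is a minimal Python sketch. It only walks a dependency graph; the flake names and the graph itself are hypothetical and not taken from any real lock file.

```python
def transitive_inputs(flake: str, inputs: dict[str, list[str]]) -> set[str]:
    """Return every input reachable from `flake`, directly or indirectly."""
    seen: set[str] = set()
    stack = list(inputs.get(flake, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            # Each input brings its own inputs along with it.
            stack.extend(inputs.get(dep, []))
    return seen


# Hypothetical graph: "my-tool" uses a helper flake, "team-config" uses "my-tool".
graph = {
    "my-tool": ["nixpkgs", "flake-utils"],
    "flake-utils": ["systems"],
    "team-config": ["my-tool", "nixpkgs"],
}

closure = transitive_inputs("team-config", graph)
```

In this made-up graph, `team-config` never asked for the helper flake, yet it ends up in its closure because `my-tool` depends on it. That is the cost the quoted advice is weighing against the convenience of a familiar layout.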

nrabulinski · March 26, 2025, 1:27pm

But their cache isn’t private as in, on prem, per-user. It’s private as in it’s a black box from our perspective, but anyone can “push” to their cache by publishing a flake on flakehub. Your argument makes sense if you’re talking about a company, hosting cache for themselves. DS’ argument about flakehub’s cache being more trustworthy because it’s a black box and not provided by the foundation doesn’t.

lucperkins · March 26, 2025, 3:11pm

That is not how FlakeHub Cache works:

Publishing flakes to FlakeHub and pushing things to the cache are separate processes.
You can only push to the cache in CI and caches are available only on a per-flake basis (so if you push to the cache for flake foo/bar that’s separate from the cache for flake foo/baz).
There are no public caches; authentication is always required for pull access.

Whatever else is happening in this discussion, I do feel like a fair representation is in order.

nrabulinski · March 26, 2025, 4:03pm

Right, but that doesn’t change the fact that you have yet to address the point @samueldr is making - hydra also just builds the nixpkgs flake so whether your advice is to use cno or flakehub cache for nixpkgs, the outcome is the same since you’re building the same nix code

EDIT: this is why the „security” argument doesn’t hold. Unless your advice is to fork nixpkgs and manually bump it so that the changes are (at least more or less) vetted, but that doesn’t have anything to do with caching - as has been said in this thread

lucperkins · March 27, 2025, 6:25pm

Apologies for the delay. I was doing another webinar this week and got a bit in the weeds on that. While I do think that the issue of review processes and standards in Nixpkgs is real and vital, on due reflection I think that some of you are correct in saying that that doesn’t get to the heart of the issue, so I’ll elaborate a bit more.

Under ideal conditions it wouldn’t matter which cache you pull something from. A derivation has a store path and that store path is in a known cache because it was built by a properly functioning version of Nix, signed, and copied over. Now here the trusted builder part emerges as the key assumption. Because you could, in principle, have EvilNix calculate a store path from a derivation, insert whatever content into the Nix store, and then sign it and copy it over to c.n.o. And then unsuspecting victims use those contents in systems all over the world and bad things happen.

Fortunately, this has never happened. Instead, only a properly functioning Nix has ever built the things pushed into c.n.o., right? Hmmm, actually no:

github.com/NixOS/infra: Document why Lix is used on builders
NixOS:main ← NixOS:lix-expl · opened 10:28PM - 11 Feb 25 UTC by infinisil · +2 -0

This was introduced [without any linked discussion/explanation](https://github.c…om/NixOS/infra/pull/524), so let's document it now. I asked about the reason on Matrix and [was told](https://matrix.to/#/!RROtHmAaQIkiJzJZZE:nixos.org/$_zH2bGUSUChkNFNjxwpCeWL_OVa-9XELobMhsqCOinE?via=nixos.org&via=matrix.org&via=nixos.dev) the answer by @K900. Though I'm leaving this as a draft until we actually know what the segfault was. Nothing against Lix, but on the Nix Hydra we should really be dog-fooding Nix. Imo if there's a bug, it should be reported so it can be fixed before upgrading. Pinging @NixOS/nix-team in case anybody has a clue what the problem could be, and how it could be fixed. Note that evals on the coordinator still use Nix, it's only builder machines that are using Lix for now.

For some period of time, Lix, not Nix, was building packages from Nixpkgs and presumably copying them over to c.n.o. Did Lix insert evil contents into those packages and have them masquerade as valid store paths? Doesn’t seem that way. Was this a big deal for me using Nix in my personal projects? Naw, not really. Should a small company running a few NixOS boxes lose sleep over this? Probably not.

But at root, and @arianvp has eloquently made similar points, it’s a question of risk modeling. If I were a CTO in, say, a financial institution or an automobile company, it would give me a great deal of pause knowing that some subset of store paths in c.n.o., even a small one, was built by not-Nix. I would wonder if similar incidents had occurred in the past. I would take things like this into heavy consideration when adopting an internal Nix strategy. More generally, the fact that something like this was possible in the first place would have made me inclined to build on my own infra long before this incident happened.

I’m not bringing this up to throw salt on old wounds or to call anyone out. I have no quarrel with the Lix project or any of the people involved with this incident. But I do think it illustrates the problem rather colorfully. So no, @samueldr, we at Determinate Systems don’t have any special insider knowledge about Nixpkgs or Hydra. We only have the public record. In the future, though, we’ll strive to do a better job of spelling out our reasoning when it comes to controversial recommendations like this. Some of the criticisms raised here were quite fair in this regard.

samueldr · March 27, 2025, 6:42pm

I agree with what you have said.

But they are wording it in a way that, with the newly made available information, is meant to imply that the upstream cache from the foundation is not to be trusted.

This is a major distinction between

In some situations it is necessary and maybe crucial to own the full builds, on your own private cache

and

The upstream foundation cache cannot be trusted, so instead use a private cache

(and juxtapose that latter claim with their private cache product.)

And here lies the issue: this is again a bite-sized microaggression trying to dismantle trust in the foundation, and not only that, pretty much shitting on the hard work of the volunteers involved, instead of participating in “improving” whatever there could be to improve, if it isn’t all FUD.

samueldr · March 27, 2025, 6:53pm

With all due respect, if this is an issue, bring it up with the people involved instead of dragging the whole project through the muck. I’m not here to judge if this was appropriate to do. But if the only thing you can say amounts to discrediting the volunteer maintainers of the infrastructure for a decision about how the infrastructure of the project is run, you have nothing to say.

Casting doubts in this situation, especially in the tenuous relationship Determinate Systems has with the community, seems like a bad idea.

I sure hope the gamble of continuously trading all the leftover reputation with the community pays off, with the organizations of any level of “seriousness” you are aiming to get business from.

sorrel · March 27, 2025, 7:01pm

I find it quite funny that you’re trying to sow doubts about the cache purely on the basis of “something that’s not Official Upstream Nix pushed to it for a while”, whilst simultaneously maintaining your own fork of Nix.

Is there enough oversight of the infrastructure that builds nixpkgs and pushes it to c.n.o? By some people’s standards, probably not! Is Lix in any way related to this? Not really. I would frankly be more worried about paths that may have been built by a buggy-but-official version of Nix, myself.

delroth · March 27, 2025, 7:20pm

There definitely isn’t enough - look at the kind of stuff that was in charge of serving builder images for h.n.o for years (until the person accused of having made the infrastructure less trustworthy fixed it): add nix-netboot-serve · NixOS/infra@a59caa2. Some random GitHub project owned by an untrustworthy for-profit corporation could have backdoored the whole NixOS build infra with no oversight!

(/s)

paperdigits · March 27, 2025, 11:06pm

Absolutely not, but if you can’t see that it is in extremely bad taste to do so while pushing your own black box product and linking the post in which you do so to the very community who built those things, then I don’t know what else to tell you.

Do you think the blog post and the comments here will improve the nix community cache?

Do you think that continual denigration of the community on top of which you build your products is a winning marketing strategy?

“Don’t trust these community things, trust our proprietary, black box thing!” Yeah, totally makes sense.

You lambasted quite a few things.

delroth · March 27, 2025, 11:59pm

This isn’t at all what people are annoyed about. I’m legitimately not sure whether you’re deliberately missing the point or just missing the point here, so to give you the benefit of the doubt, here’s a rundown:

1. You work for a for-profit company which has a giant dependency on the Nix/NixOS communities. Let’s be real: without a community project like nixpkgs none of what DetSys provide has any value to anyone. That’s true of basically every other for-profit company building on top of Nix (Cachix, Flox, etc.) - without the ecosystem and the thousands of unpaid contributors you might as well close down.
2. The community heavily depends on c.n.o/h.n.o. They’re existential risks to the project, it’s basically infeasible for most individual users to use nixpkgs without them (due to the cost of builds). Transitively, if c.n.o/h.n.o weren’t there, you’d lose enough of the community that it would stop operating, and see point 1 for what that means for your company.
3. c.n.o and h.n.o are expensive to run, they are core to the existence of several for-profit companies (see points above), but those companies don’t give back anywhere close to what they’re receiving for it. It’s held together by (fragile) sponsorships from companies outside of the ecosystem (e.g. Amazon) and (generous but insufficient) donations made largely by individuals. That’s also true of nixpkgs and other load-bearing community pieces that y’all don’t pay for, but it’s especially painful and noticeable for c.n.o/h.n.o where there’s a clear $ amount to pay.
4. Not only is DetSys in this case not contributing what it should probably be fairly contributing, you’re now explicitly telling people that distrusting the community’s work is a best practice, and serious people should instead pay you for storage and cache management. Meanwhile, the value-add of DetSys only makes sense in a world where the nixpkgs community exists and is healthy, and that’s deeply tied to c.n.o/h.n.o being healthy and usable/trustworthy. It feels completely hypocritical.

Now for technical points, because I happen to have worked on supply-chain security - both in the FAANG world and in the open source world. Your suggestion to replace c.n.o by FlakeHub is puzzling because the logic does not hold. FlakeHub can replace c.n.o’s S3 storage, yes. It does not replace h.n.o. But c.n.o is not a trusted component in the NixOS infrastructure - everything on c.n.o is signed ahead of time and verified after downloads. DetSys does not provide a viable alternative to h.n.o. FlakeHub is a glorified managed S3 bucket, not a build infrastructure. As far as I know, DetSys’s attempts at a hosted private Hydra service never went anywhere. Which means your whole argument is a non-sequitur: the alternative you’re proposing as a “best practice” isn’t “replace c.n.o with FlakeHub”, it’s “replace c.n.o with FlakeHub and make up your own replacement for h.n.o, you’re on your own lol” and hope that the replacement is in fact better and more trustworthy than h.n.o? How do you even begin to compare trust levels between h.n.o and something that doesn’t exist?
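A rough Python sketch of the "signed ahead of time, verified after download" model described above may help. It is a toy under loud assumptions: Nix binary caches use Ed25519 key pairs, whereas this uses HMAC from the standard library as a stand-in so the sketch runs anywhere, and the fingerprint format, key, and store path are hypothetical.

```python
import hashlib
import hmac

# Stand-in for the builder's signing key. Real Nix caches use Ed25519 key
# pairs; HMAC is used here only so the sketch needs nothing beyond stdlib.
BUILDER_KEY = b"hypothetical-builder-secret"


def fingerprint(store_path: str, content: bytes) -> bytes:
    """Bind a store path to the hash of the content served for it."""
    digest = hashlib.sha256(content).hexdigest()
    return f"{store_path};sha256:{digest}".encode()


def sign(key: bytes, store_path: str, content: bytes) -> str:
    """What the builder does before anything is uploaded."""
    return hmac.new(key, fingerprint(store_path, content), hashlib.sha256).hexdigest()


def verify(key: bytes, store_path: str, content: bytes, signature: str) -> bool:
    """What the client does after download, whatever storage served the bytes."""
    return hmac.compare_digest(sign(key, store_path, content), signature)


path = "/nix/store/hypothetical-hello-1.0"
good = b"honest build output"
sig = sign(BUILDER_KEY, path, good)

# Storage that tampers with content is caught, no matter who hosts it.
storage_ok = verify(BUILDER_KEY, path, good, sig)
storage_tampered = verify(BUILDER_KEY, path, b"swapped by the storage host", sig)

# A compromised builder signs whatever it likes, and verification passes.
evil = b"backdoored build output"
builder_compromised = verify(BUILDER_KEY, path, evil, sign(BUILDER_KEY, path, evil))
```

In this toy, tampering by the storage host fails verification while a compromised builder sails through, so the check is only as good as whoever holds the signing key. Swapping one storage host for another changes neither outcome.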

And this falls even more on its face when you consider that a chain of trust is only as strong as its weakest link, and by far the weakest link for nixpkgs is not h.n.o, it’s the process of submitting changes to nixpkgs itself. With the limited time that security-minded NixOS people have to spend on improving the chain of trust for the package set and the distro (because yes, there are people who care about this), improving the trust for h.n.o barely registers. It still gets discussed because nerds love easy theoretical problems more than realistic human problems like “how do we design a contribution review process that doesn’t let people backdoor stuff”, but I’ll reiterate - de-trusting h.n.o does barely anything for companies relying on nixpkgs (and de-trusting c.n.o by using FlakeHub does precisely nothing).

Hopefully now it’s clearer how your actions and your employer’s actions could be viewed as insulting from a community standpoint? And to be quite honest, from a technical standpoint I’m not even sure if feeling insulted would be the right response, it’s almost more puzzling how out of touch / off the mark your recommendations are. It’s a lack of understanding of supply chain security that I wouldn’t expect to see advertised on a corporate blog.

PS: running a second mirror of h.n.o operated by an independent party is also something that DetSys could completely just, like, do. It’s like a few $K/month to run a copy of h.n.o and verify all the hashes so you can guarantee that h.n.o is not compromised. It’s not even hard! Anyone with a few $K/month could do it.
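The "verify all the hashes" idea in the PS can be sketched in a few lines of Python. This is only an illustration: the store paths and hashes are made up, and how the two hash listings would actually be collected from each builder is left out entirely.

```python
def compare_builds(primary: dict[str, str], mirror: dict[str, str]) -> dict[str, list[str]]:
    """Compare content hashes reported by two independent builders."""
    shared = primary.keys() & mirror.keys()
    return {
        "match": sorted(p for p in shared if primary[p] == mirror[p]),
        "mismatch": sorted(p for p in shared if primary[p] != mirror[p]),
        # Paths the second builder has not rebuilt yet cannot be vouched for.
        "unchecked": sorted(primary.keys() - mirror.keys()),
    }


# Hypothetical store path -> content hash listings from each builder.
primary = {
    "/nix/store/aaa-hello-1.0": "sha256:1111",
    "/nix/store/bbb-curl-8.0": "sha256:2222",
    "/nix/store/ccc-openssl-3.0": "sha256:3333",
}
mirror = {
    "/nix/store/aaa-hello-1.0": "sha256:1111",
    "/nix/store/bbb-curl-8.0": "sha256:9999",
}

report = compare_builds(primary, mirror)
```

A mismatch in such a report does not prove compromise on its own, since a build that is not bit-for-bit reproducible would show up the same way, but it does tell you exactly which paths deserve a closer look.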