Reconsider reusing upstream tarballs

You can do the autoreconf in a separate derivation if you want to. In any case we should always fetch from source, and we should verify that the tarball matches the given commit: calculate the tree hash of the tarball contents, fetch the data for the specified commit, and check that the two match. The server could fabricate a commit with bogus metadata or a bogus parent, but doing so for a chosen commit hash would essentially mean finding a valid preimage with a specific structure. It is not unlike what Bitcoin does, except with the difficulty set to maximum, since you would control all the output bits of the hash, with the commit metadata as the nonce and the tree hash as the block you want to include.
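A rough sketch of what such a check could look like as a derivation; every URL, rev, and hash below is a placeholder, and for most real projects the raw comparison will fail because release tarballs add generated files on top of the tagged tree — which is exactly the gap being discussed here:

```nix
# Sketch: unpack the release tarball, hash its contents as a git tree,
# and compare against the tree hash recorded in the upstream commit.
{ lib, runCommand, fetchurl, fetchgit, git }:

let
  tarball = fetchurl {
    url = "https://example.org/foo-1.0.tar.gz"; # placeholder
    hash = lib.fakeHash;                        # placeholder
  };
  # leaveDotGit keeps the commit object around so its tree hash can be read.
  tagged = fetchgit {
    url = "https://example.org/foo.git";        # placeholder
    rev = "refs/tags/v1.0";                     # placeholder
    leaveDotGit = true;
    hash = lib.fakeHash;                        # placeholder
  };
in
runCommand "verify-tarball-matches-commit" { nativeBuildInputs = [ git ]; } ''
  export HOME=$TMPDIR
  git config --global --add safe.directory '*'

  mkdir unpacked
  tar -xf ${tarball} -C unpacked --strip-components=1

  # Hash the tarball contents the way git would hash a tree.
  git -C unpacked init -q
  git -C unpacked add -A
  tarballTree=$(git -C unpacked write-tree)

  # The tree hash the upstream commit claims to contain.
  commitTree=$(git -C ${tagged} rev-parse HEAD^{tree})

  if [ "$tarballTree" != "$commitTree" ]; then
    echo "tarball contents do not match the tagged tree" >&2
    echo "  tarball: $tarballTree" >&2
    echo "  commit:  $commitTree" >&2
    exit 1
  fi
  touch $out
''
```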

It isn’t perfect, but it’s better: it ensures that a git clone gives you the same source code as the tarball.

FWIW, this doesn’t need to have solved the xz issue entirely to be useful; closing even one attack vector at minimal cost is concretely useful.

Of course SHA-1 (which git uses for hashing) is broken, but git will eventually move to SHA-256, and even now it’s better than nothing.

If I have time I might do it but working on Nixpkgs is a pain.

1 Like

Maybe; let’s work more on top of actual code rather than spending too much time on discussion.

Here’s something I have been thinking of doing, in WIP form:

I believe the TODO list is to:

(1) provide a diffoscope version for developers so we can analyze the diffs
(2) provide “normalizers”, e.g. normalize line endings, normalize processed PO translation files, etc. We are bound to have divergence between release tarballs and git sources. Which normalizers are acceptable is an interesting question: ideally, normalizing should not increase the chance of hiding executable code in the wild, but, say, if someone hides a binary in a “processed” .po translation file and loads it, that is annoying. (See the sketch after this list.)
(3) provide “reproducers”, e.g. ways to reproduce certain generated files. I know that autoconf leaks the exact tool versions into the generated files; that is a challenge to overcome, but not a big problem. For example, .gmo files could be reproduced. Then it also becomes a reproducible-build situation.
(4) sprinkle that slowly and surely to build confidence in our bootstrap chain.
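To make (1) and (2) concrete, here is a hedged sketch of what a normalizing diff report could look like. `tarballTree` and `gitTree` stand in for the two unpacked sources, and the specific normalizers are only illustrative examples:

```nix
# Sketch: normalize both trees, then emit a diffoscope report for humans to
# review; whatever still differs is what a reviewer should look at.
{ runCommand, diffoscopeMinimal, dos2unix, tarballTree, gitTree }:

runCommand "source-equivalence-report"
  { nativeBuildInputs = [ diffoscopeMinimal dos2unix ]; }
  ''
    cp -r ${tarballTree} a
    cp -r ${gitTree} b
    chmod -R u+w a b

    # Normalizers: remove differences we have decided are not interesting.
    for tree in a b; do
      # Compiled gettext catalogs are generated from the .po files;
      # drop them and let the .po sources be compared instead.
      find "$tree" -type f -name '*.gmo' -delete
      # Line-ending differences in plain-text files.
      find "$tree" -type f -name '*.txt' -exec dos2unix -q {} +
    done

    # diffoscope exits non-zero when differences are found; keep the report.
    diffoscope --html-dir "$out" a b || true
    mkdir -p "$out"   # in case the trees were identical and no report was written
  ''
```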

I won’t be able to finish all of that by myself, and I am pretty sure this will be much more tractable with focused community effort and upstream collaboration; I have heard that upstreams such as autoconf would be open to working towards reproducible builds for their artifacts or moving away from m4 macros.

7 Likes

For people wanting a diffoscope output, here’s a quickly hacked one (notice the Nix-induced noise from the permission bits!): /nix/store/746sj2mdjma4cfz1gq9mh1wymsrqnapi-diffoscope-261/bin/diffoscope --new-file --html-dir /nix/store/2zwg8r96if9i20z677b1pdz3wkrqncr3-check-source-equivalence-xz-vs-xz-5.6.1.tar.bz2 /nix/store/6w6yn8xs3pglswlzvm2i2cn3wd4jvrr3-xz tested/xz-5.6.1

A release tarball can contain arbitrary data; there is no guarantee that it corresponds to the code in the tagged revision for that release, as was demonstrated with xz. That makes dropping release tarballs almost a no-brainer for me.

In the normal case, however, this extra content amounts to a pre-generated configure script and other autotools baggage.

One major aspect of this topic that I think is commonly overlooked is that this affects reproducibility: the release tarball contains artifacts (not code!) which are generated from the actual code in an entirely non-reproducible manner.

I think it is worth discarding release tarballs for this reason alone, because I am of the opinion that all artifacts should be defined in terms of a Nix derivation. Conversely, the only fixed-output inputs to a derivation should be code produced by humans, which cannot be generated automatically and can therefore never be reproducible.

Related discussion: btrfs-progs: build from source, not release tarball by Atemu · Pull Request #124974 · NixOS/nixpkgs · GitHub

Some further discussion of using git sources over release tarballs:

Pro:

  • Harder for malicious upstreams to hide malicious code not present in git
  • Every theoretically reproducible aspect of the source can be reproduced under Nix’s reproducibility (r13y) guarantees
  • Trivial to use intermediate src revisions
  • Trivial to substitute the src using the tree of source files in any format. (Producing the exact same tarball as what the upstream generated on their machine is very hard, even if you had the exact same source files it contains.)

Con:

  • Longer configure times
  • No data source for SOURCE_DATE_EPOCH; with tarballs, the newest file mtime would otherwise be used. (See the sketch after this list for one possible workaround.)
  • Possibly complicates bootstrap (xz is part of bootstrap for instance)
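One possible (and heavily hedged) workaround for the SOURCE_DATE_EPOCH point: keep the .git directory and use the commit timestamp. leaveDotGit has known hash-stability caveats, so treat this as a sketch rather than a recipe; the URL, rev, and hash are placeholders.

```nix
# Sketch: derive SOURCE_DATE_EPOCH from the commit date instead of file mtimes.
{ lib, stdenv, fetchgit, git, autoreconfHook }:

stdenv.mkDerivation {
  pname = "example";
  version = "1.0";

  src = fetchgit {
    url = "https://example.org/example.git"; # placeholder
    rev = "refs/tags/v1.0";                  # placeholder
    leaveDotGit = true;                      # keeps commit metadata around
    hash = lib.fakeHash;                     # placeholder
  };

  # git to read the commit date, autoreconfHook because we build from git.
  nativeBuildInputs = [ git autoreconfHook ];

  postUnpack = ''
    export SOURCE_DATE_EPOCH=$(git -C "$sourceRoot" log -1 --format=%ct)
  '';
}
```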
21 Likes

I continue to be confused by this focus on release tarballs versus repositories, when the argument seems to center around generated files. Is it not a common pattern to include autotools-generated files in the repository? I believe many projects do it each way. Git repositories don’t automatically protect us from malicious developers; they may even give a false sense of security. Diffing tarballs against the repository won’t catch something that was maliciously inserted into both locations. It seems like we’re groping around for some technical solution we can be seen to have done here, when the real problems are social: Are we getting pre-embargo notices of vulnerabilities? Are we doing enough audits of critical packages? Are our users properly aware of the security consequences of using the unstable channel?

3 Likes

I am sorry, but the perception I get from your message is “this problem is social, so let’s not do any technical solution”; I assume this is not what you meant. If it is what you meant, I will simply say that in cybersecurity we usually try to raise the cost of the attack. Checking source equivalence systematically raises the cost of performing a stealthy attack because, as you note, you are forced to push it in both locations. That is a step forward, in the right direction.

On the social problems you mention, I don’t disagree, but people are usually better at solving closed problems than open-ended ones, and at making small but certain steps.

Pre-embargo notices of vulnerabilities? Discussed at length on this forum and among the nixpkgs security community (I suppose you have already read all the discussions). Could you explain what you believe this would buy us, and what has changed since the last discussion on that topic?

Doing enough audits of critical packages? Probably not. What is a satisfying audit of a critical package, though? Just staring at assembly output? Diffoscoping binaries across distros? Putting the binary in a sandbox and looking at auditd logs? Of course we are not doing enough, but that’s not a very actionable proposal, I’d say :stuck_out_tongue:. For example, I cannot come up with an automated audit for xz that would have detected this thing a priori. I’m not saying audits are not worthwhile, but reacting to an existing threat and ensuring we don’t fall into it again is better than doing nothing; analyzing the whole class of attacks and providing comprehensive solutions is even better, albeit harder!

Are our users properly aware of the security consequences of using the unstable channel? What does that even mean?


I think the biggest (social) problem in this situation is that we don’t have a proper vulnerability tracking and remediation platform for the whole ecosystem. That is a bit of a shame, because such a platform is useful for communication, boring discussions with stakeholders, etc. Shameless plug: I have been working on that (GitHub - Nix-Security-WG/nix-security-tracker: Web service for managing information on vulnerabilities in software distributed through Nixpkgs), but it takes time and effort, and I have recently been more tired than usual and close to the burnout line. :slight_smile:

9 Likes

How does doing this raise the cost of an attack above rerunning autotools regardless of the source, which seems like a much more straightforward approach requiring much less maintainer involvement?
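For reference, here is roughly what that alternative looks like in nixpkgs terms (a sketch only; whether the package then builds cleanly depends on how complete its autotools inputs are, and doing this for a bootstrap package like xz inside nixpkgs itself raises the ordering questions discussed further down):

```nix
# Sketch of the "just rerun autotools" option: autoreconfHook runs
# `autoreconf --force`, so the configure/Makefile.in copies shipped in the
# release tarball are regenerated during the build rather than trusted.
let
  pkgs = import <nixpkgs> { };

  xzRegenerated = pkgs.xz.overrideAttrs (old: {
    nativeBuildInputs = (old.nativeBuildInputs or [ ]) ++ [ pkgs.autoreconfHook ];
  });
in
xzRegenerated
```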

I’m not saying ‘let’s not do anything technical’; I’m saying focusing on the release tarball thing specifically is an example of the streetlight effect. It zeroes in on one distinctive part of this particular attack—an attack that doesn’t seem to have impacted us, and if it turns out that this is wrong because there is more malicious code hiding in the source, diffing tarballs won’t do anything about that! There are other things that this attack is much more likely to have in common with the next one, and those areas are where we should concentrate our search. If those areas have technical solutions, let’s implement them by all means! But to my eyes at the moment, they seem to be non-technical.

It would buy us more time to roll out patches. I don’t know if anything has changed since the last discussion, but it’s an area I would put spare person-hours into improving.

We share a lot of packages with a lot of other distributions, many of which are backed by commercial entities with professional security obligations. I imagine there are ways we could take advantage of that. I don’t know exactly what happened to the CVE roundup threads I used to see here, for example; I imagine that they were run by a volunteer who got burnt out or something like that. We could resurrect that and allocate some foundation funds to keeping it running, perhaps, and maybe that’s what you’re already doing. That may or may not be the lowest-hanging fruit; I don’t know. But it’s an area I would put spare person-hours into improving.

It means that over in the other thread are a bunch of people astonished that it’ll take a couple of weeks for the xz rollback to ripple through our builds and land on their servers. Those people should probably have had their servers running on stable channels, where software lags the bleeding edge by several months, and issues with new releases (including security issues such as this one) are typically long resolved by the time they upgrade to the next release.

2 Likes

So, the situation is that we have source A and source B for program D. Sometimes we cannot obtain source B in the dependency graph, because obtaining source B requires program C, and C is not available at the time we are building program D.

A way to sidestep this issue, while not blindly trusting any release tarball, is to perform a source equivalence verification: verify that source A is equivalent to source B once source A has been built into program D and program C has become available.
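One way to express that ordering in nixpkgs terms is to hang the check off `passthru.tests`, which is outside the bootstrap-critical build graph but still gets exercised by CI. This is a hedged sketch; `checkSourceEquivalence` is a hypothetical helper along the lines of the derivations sketched earlier, not an existing function, and the hash is a placeholder.

```nix
# Sketch (as an overlay): the equivalence check can depend on tools
# (git, diffoscope, autotools) that only exist after bootstrap, because
# passthru.tests does not feed back into xz's own build.
final: prev: {
  xz = prev.xz.overrideAttrs (old: {
    passthru = (old.passthru or { }) // {
      tests = (old.passthru.tests or { }) // {
        source-equivalence = final.checkSourceEquivalence {
          tarball = old.src;                 # source A: the release tarball
          gitSrc = final.fetchFromGitHub {   # source B: the tagged commit
            owner = "tukaani-project";
            repo = "xz";
            rev = "v${old.version}";
            hash = final.lib.fakeHash;       # placeholder
          };
        };
      };
    };
  });
}
```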

This is not about rerunning autotools; it is about the general idea of sidestepping the bootstrap-induced difficulties of trusting various source files, some of which may be auto-generated and some of which may be irrelevant.

Rerunning autotools focuses on one example rather than addressing the general class of problems behind it.

On what is likely to be shared by the next attack, I guess we differ in opinion. What I’m saying is that we can afford to explore multiple avenues. If you feel we should focus on something else, there are plenty of folks who can help with that, but I’m unconvinced that other distros will do something about this, or that the next attack won’t share similar components.

BTW, I don’t think we should stop there; the attack also had mechanisms to disable some features, and the Debian folks already have more or less a solution to this: Debian -- Details of package blhc in sid.

I don’t see how this can be improved by putting spare person-hours into it. You need private infrastructure and significant means to build patches privately; our security team usually gets pre-embargo information just by watching other distros break the embargo because they don’t play by the rules.

For proposals that directly concern the security team, I recommend talking to them, because I’m not on the security team. I don’t think it’s a realistic proposal given the current situation, and nothing has changed IMHO; if anything, we have even more trouble with our build infrastructure, and “more spare person-hours” are usually not the solution: what matters is a certain quality of those person-hours.

How are you certain that the CVE roundup threads were useful? It seems to me that you are throwing out what you think are good ideas. I’m not saying that is bad per se, but I am failing to see an underlying reasoning behind your proposals beyond “they look like good ideas, but I don’t know their actual effects”.

BTW, I guess you didn’t read my post on the security tracker, because it is the modern manifestation of those ideas while addressing the low-technology issues that come with them.

Oh yes, I totally agree with you on this one. So here’s something I have been wanting to do and thinking about: this is a UX problem.

We do not have a way to communicate to users:

  • You have urgent critical security upgrades, please upgrade
  • A critical security upgrade will arrive in N days; it is in the “build farm”
  • A prediction of those N days, based on all the data we have

I have put pieces towards that in place: the security tracker, pulling all the PR data, trying to get work done on label-tracker, etc. If burnout were less of a problem, I’d have put together a GSoC proposal to work on “predicting channel arrival times for a nixpkgs PR”.

We also do not have easy ways to offer the user:

  • You cannot wait? You can attempt to graft the fix for that security issue; here’s the magic code / command (see the sketch below)
  • You have a build farm? Here’s how to force the fix to be applied, in exchange for a mass rebuild

etc.
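For the “graft” bullet, NixOS does already have a building block, even if the UX around it is rough. A hedged sketch; the version, URL, and hash are illustrative only, and keeping the replacement’s store name the same length as the original is advisable since the option rewrites store-path strings in place:

```nix
# Sketch: rewrite references to the vulnerable xz in the running system
# without a mass rebuild, using system.replaceRuntimeDependencies.
{ pkgs, ... }:
{
  system.replaceRuntimeDependencies = [
    {
      original = pkgs.xz;
      replacement = pkgs.xz.overrideAttrs (old: {
        version = "5.4.6";                                 # assumed known-good version
        src = pkgs.fetchurl {
          url = "https://tukaani.org/xz/xz-5.4.6.tar.bz2"; # illustrative URL
          hash = pkgs.lib.fakeHash;                        # placeholder
        };
      });
    }
  ];
}
```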

Unfortunately, I see a lot of discussion, but I don’t see a lot of folks putting work into this. I even have a budget for some of these things, but it’s hard to find people with the right skills in this community, and once you find them, they are usually busy with something else that is also very important…

4 Likes

Don’t we already solve this problem in a different way, though? How does the first compiler get built? I am not at all aware of the Nixpkgs-specific details, but at some level it has to be that we pull in a bootstrap compiler binary from somewhere and use it to compile a new compiler binary. Why wouldn’t we use the same mechanism to pull in a ‘program C’ binary, or any other binary we need for stdenv but won’t have until stdenv is built?

If you also want to implement post-facto source equivalence on top of that, I again am not sure exactly what that would prevent. If we solve this problem using bootstrap binaries in some instances and source equivalence checks in others, it seems to me like that’s more complexity than is warranted.

But I mean, I don’t know. You’re right that I’m just spitballing here when it comes to alternative proposals. The most constructive thing I can contribute is to say that your approach doesn’t strike me as likely to be the most maintainer-time-efficient approach based on the arguments here, for whatever weight that’s worth. If you’ve spent more time in the weeds than I have and you think I’m missing the point, feel free to conclude that your analysis trumps mine. If you want to do something now with the design you have instead of designing something that might be more maintainer-friendly but requires more discussion, it’s not like I can stop you.

I think making a semi-“RFC” issue in nixpkgs would be nice to track the progress and have a centralized place to contribute ideas to.

1 Like

I started work on the git-hashing experimental feature (see also this comment: Obsidian Systems is excited to bring IPFS support to Nix - #65 by Ericson2314) in large part because of this sort of thing.

Ideally all fetching would start with a source control signature (signed tag, signed commit, etc.), and we would simply traverse the Merkle structure from there until we get all the source code we want. Filtering can happen in subsequent steps.

When we start with pre-filtered (or otherwise pre-processed) source code, we indeed lose the provenance. In addition to being bad for security, this makes it harder to interface with things like Software Heritage.

7 Likes

Tangential, but I’ve wondered if anything interesting could/would get built if common builders produced N additional outputs with intermediate artifacts (or maybe just checkpoints of the build dir?), one of which is probably the source at one or more points between unpack and build (maybe just diffs if there is more than one point).

Some of the value here is just having a standardized source output so that analysis tooling doesn’t have to unpack all kinds of src attrs correctly. IDK if there’d actually be leverage, but it should at least support tooling that can roughly see how configure/build transform the source and perhaps record, for example, which blobs in the source turned up in the binary.

(I’m not sure if there’d be too much of this for it to be a human-reviewable signal. I’m also not sure if it would even catch the case at hand, here, since I think I recall seeing that it used a bunch of head invocations to extract different parts of the binary blob.)

I suspect the extra storage is a hurdle to doing this globally unless maybe we can squint and imagine that doing whatever we can to attract/support/empower security researchers will pay dividends in our security in the long run.

1 Like

I don’t think this would solve anything; it would just be “passing on the blame”. (For clarification: in this specific case it would have solved the issue, from what I know of the situation, but I don’t think it mitigates the risk at all where a malicious actor is concerned.)

A tarball released by a core maintainer is essentially as trusted as you can get. The problem here was that the root of trust was a malicious actor, which I doubt anyone can really solve.

5 Likes

Perhaps there’s a compromise between these positions where we look at whether we can programmatically surface how a canonical source and the tarball differ from each other, and how that difference itself shifts during a source update?

I.e., rather than taking a position on release tarballs, we evaluate whether we can increase the number of eyeballs that see this delta and especially any sudden changes to it?

I cannot give a micro-introduction to our bootstrap situation in nixpkgs here; I’ll let you check the details. Your proposal is equivalent to adding some Git to the bootstrap seed; consider the consequences of such a decision w.r.t. the recent work of @emilytrau on minimal bootstrap (which we are not using by default, IIRC/AFAIK).

It is another layer of defense in depth. This is a source equivalence check, nothing functionally critical; if it’s broken, we can skip it and move on.

I am dubious of the complexity claim. I cannot answer it without comparing against a functional alternative (i.e. something that catches this sort of thing), and I see none proposed so far except modifying the bootstrap seed footprint.

Right. I am also ultimately pushing my own opinions. What I’m trying to get at is that this is an endless problem and won’t be solved by technical solutions.

Under the constraints we are currently operating under, my proposal is a supplement. Modifying the bootstrap chain, as I see it, is not something you add on top of the situation; it disturbs that set of constraints, and decreasing the bootstrap seed cannot be achieved by growing it with git.

So I am immediately shortcutting to the next low-hanging fruit in my eyes: source equivalence. If you see another supplementary low-hanging fruit, I am all ears, if that sheds some light on my full mental model.

In the end, whatever we come up with is naught in front of a motivated attacker, and at some point technical solutions only buy so much if you don’t involve fields like sociology to analyze what’s going on here and how to address it. But that’s my personal opinion as someone who has spent a bit of time with sociologists in FLOSS environments; I could be wrong and missing obvious things. So the question I should ask you is: what do you expect to achieve with maintainer time w.r.t. this problem?

FWIW, @JulienMalka rightly said (in another, private discussion) that another trivial fix is to integrate a post-bootstrap phase that rebuilds the binary and performs bit-to-bit reproducibility checks on the outputs; this would not have caught the xz situation, though, because the build script is perfectly reproducible pre- and post-bootstrap.

1 Like

pkgs.srcOnly allows for this (within reason).
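A small sketch of what that gives you:

```nix
# pkgs.srcOnly exposes the unpacked (and patched) source of a derivation as a
# plain store path, so analysis tooling does not need to know how to unpack
# every kind of src attribute itself.
let
  pkgs = import <nixpkgs> { };
  xzSource = pkgs.srcOnly pkgs.xz;
in
pkgs.runCommand "list-xz-source" { } ''
  find ${xzSource} -type f > $out
''
```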

I have much less trust in release tarballs than in automatically generated source-code tarballs - even if the release tarball was generated by the core maintainer.

Why?

  1. A release tarball can contain anything and have nothing to do with the code in the repository. Assuming that the release tarball is what you expect it to be (e.g., autoconfigured source code without malicious content) is one extra step of trust that you’ll need to take.

  2. The release tarball can be generated by one person, and you have no idea how many people have audited it, so you need to trust this single person (e.g., a core maintainer). In the code repository itself, in principle, if you see that many people have made commits, you have some reassurance that they might have checked the commits before theirs or run/tested the code. Not 100% reassurance, but definitely not zero.

  3. How many people are monitoring the release tarball content? Probably much fewer than the code repository. That’s why it’s difficult to put malicious code into the actual repository without being caught: all developers are using the same files, reading/editing them, running things locally, etc., so there is a much greater probability of getting caught.

So, to me, using source tarballs that are automatically generated by GitHub (or some other trusted host) seems like a no-brainer whenever feasible.

Of course, this doesn’t solve all risks concerning malicious actors, as they can indeed put malicious code into the repository as well. But that’s more difficult to do without being noticed.

If I were a malicious actor authoring a repository for widely used software, putting the malicious code into the release tarballs seems like a very easy attack with a much greater chance of success than trying to put it in the code repository.

Even if we can’t make all malicious acts impossible and there’s always the risk of trusting malicious actors, it’s good if we can make malicious acts more difficult (or eliminate the easy malicious acts) without significant added cost. I think using source-code tarballs instead of release tarballs does that.

10 Likes

This would be correct in a perfect world, but again, it is highly targeted towards the most recent attack.

It’s easy to say people will audit the code, but in reality that is really not going to happen. Let’s look at nixpkgs and how “easy” it is to get code into the repository: all you need is one malicious maintainer and you can essentially pump code into the repository relatively unnoticed.

As an example, how many reviewers actually check the source of every URL specified in patches, or is there implicit trust being invoked here? I’m adding this because I recalled a non-malicious example of how little we probably know about our own systems: Discord's disable breaking updates script breaks if settings.json is invalid · Issue #206106 · NixOS/nixpkgs · GitHub. Every person running that derivation is unintentionally also running a script that transparently modifies expected upstream behavior, with few obvious signs.

And with regard to the current situation, this would not have changed anything. The owner was on vacation and basically trusted the other core maintainer fully, who had commit access and could have committed anything into the repository (the only difference is how visible the changes would have been). Given there were only two maintainers, how many people audit the xz repository on a regular basis? No one besides those two. For example, there was the Landlock bypass, which was only recently reverted in that same repository.

What I’m really saying is that if you want to prevent the current situation (in the specific scenario that occurred), yes, it will work. But that only treats the symptoms and not the root problem, and the root problem is close to impossible to fix from our side; it could really be considered an inherent problem of open source.

3 Likes