My nixpkgs trust worries

I think that would be easier to implement and could help in reducing random dependencies.

It would still be a massive undertaking, and a badly maintained package could also simply be finished: working, and using a robust build system that does not break all the time.

So there are multiple things here, and I think they are being somewhat conflated. To be fair, this can read as nitpicky and probably ranty; I do not mean it that way. I am trying to separate the different things in the hope that it helps the discussion progress.

I think there are multiple aspects to consider in your question. There is Trust, Verification, and then a discussion of possible attacks and of the constraints put on both maintainers and users.

I will participate from the point of view of a user of Nixpkgs, a maintainer of some libraries for a programming language ecosystem, a person actively participating in "Software Supply Chain" work, and a maintainer/reviewer/owner of a Nixpkgs subsystem.

First, Trust. People trust nixpkgs because the organization as a whole has demonstrated that it takes steps toward being trustworthy. In particular, by showing that we all care about reviewing (the number of PRs that get stuck because they are not good enough is a good example), that we care about stability (we separate the NixOS release from the unstable release and limit backports), that we care about security (see for example the current threadnaught to remove Python 2.7), etc. We do a lot of things that warrant "some trust". How much you need is of course personal, and linked to the third thing, so I will come back to "enough" trust later.

Second, Verify. Whether you trust us or not does not really matter here. Being able to verify that the changes coming in a new version are the ones you want is a different capability. This is what you seem to be hinting at with the tools you are asking for. That is great, but it generates a second trust problem, due to the third thing I wanted to mention…

Constraints. The reality is that any advanced distribution, or any modern list of software to run, is massive. We are talking millions and millions of SLOC. Even just nixpkgs is massive, and you use far more packages and modules in even a bare-bones install than you probably think. Verifying everything you get is… simply impossible before the heat death of the universe. At least for you alone. So you end up needing multiple people to do so, and then you need to trust their reports.
And you are back to square one. The sad reality is that verifying like this is simply not possible in any modern system. How much you verify (partially, with a selection rule that can be random, targeted by risk, or whatever else) depends on you and on how much you are scared of this attack. But more importantly, it will depend… on how much verification you can afford.

So this kind of tool is great, and indeed we have some. And yes, this kind of attack is entirely possible…

The reality is that there is not a lot we can do about it. It is a problem fundamental to the way modern software works. It is this way for a reason, and there are no good solutions.

Are there things we could do? In general, yes, but then we hit the second big constraint of Software Supply Chain projects… No one funds digital infrastructure, which this is. This means that all the work is done by volunteers, which drastically limits how much we can ask of them. Hell, I am one, and there are multiple things that would be "needed" to make my packages and libraries "more trustable" that I refuse to take on. Because this is a hobby, free time that I give, and quite honestly we are already burning out maintainers fast.

I am sure you have read it, but for anyone who has not, I highly recommend reading Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure - Ford Foundation.

It is still as relevant today as it was in the past. And as mentioned above, if you are worried about nixpkgs, consider that Autoconf and OpenSSL, for example, do not have any maintainers who can afford any of the basic demands you make here. The problem runs far deeper, and quite honestly nixpkgs already does far more than most of the rest of the ecosystem out there.

7 Likes

My problem is that there is so much activity, it's impossible to review everything going in. …snip… I wondered if other people shared these worries.

In addition to what @DianaOlympos wrote, it's important to keep in mind that there is no end to how deep the trust rabbit hole goes.

Let's say your verification tool shows exactly what changed between two builds of your system.
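(As a concrete sketch of what such a tool could look like, assuming a recent Nix with the experimental nix-command feature enabled, one could diff the closure of the running system against a freshly built one:)

$ # build the new system without switching to it
$ nixos-rebuild build
$ # show package additions, removals and version/size changes between the two closures
$ nix store diff-closures /run/current-system ./result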

But do you trust the maintainer of the verification tool? All the various pieces of software used to build your unique system? The people who wrote the possibly compromised compilers used to build said software? Every VCS hosting service used to host the source? Every employee of the data centers in which this stuff runs, with access to the hosts on which the builds run? The manufacturers of the hardware that runs the builds? The people who wrote the firmware that powers the hardware? The trucking company that moved the servers from the factory to the end user, or the warehouse operator where the servers are stored? How about all the subcontractors anywhere along the line who have access to slip something in, one way or another?

And we're small. Like super-tiny, insignificantly small. You want to go through the hassle of running a covert supply-chain attack against a distribution? You'd pick Fedora, for the chance of it making it into RHEL where the juicy targets are, or Debian/Ubuntu to reach the largest number of users.

I am not saying we cannot and should not do more to ensure the integrity of our process. We might even have some low-hanging fruit that will dramatically improve the current situation, but you will have to trust people and accept the risk.

So as you may have guessed by now, no, I don't worry about someone sneaking in a malicious NixOS module change.

7 Likes

Thanks for all your arguments :pray:

Yeah, that's a good point. And even if distribution packages themselves were fully audited, one could attempt to compromise one of the thousands of upstream projects. Compromising a single widely used package would give you a backdoor into all distributions, so why focus on just one?

I don't want to name and shame projects, but I have seen at least one fairly widely used project where someone with a very short history with the project, seemingly appearing out of nowhere, got commit rights. Many open source projects are very understaffed, so it's fairly easy to become a contributor with a commit bit.

1 Like

When I first got into Nix, I had the same questions, and I did some research.

I concluded that Nix is no worse than anything else out there already… From a packaging point of view, you can build binary packages yourself from source; it's not difficult to do. You can even run Hydra and build everything yourself. Nix is not machine code: once you get it, you can see what it's doing (patching, building, etc.). I find it easier to audit than endless Ansible YAML files… or bizarre Python build DSLs like Yocto.
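(A minimal sketch of what "build it yourself from source" looks like in practice; hello is just an example attribute, and disabling substitution forces a local build instead of pulling from the binary cache:)

$ # build the package locally from source instead of fetching the cached binary
$ nix-build '<nixpkgs>' -A hello --option substitute false
$ # the result symlink points at a store path you built and can inspect yourself
$ ls -l ./result/bin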

With other distros I have to trust not only the base operating system gods, but also the people who package the software, and I have to trust THEIR actual system security too. If they build a binary and I use it, I have very little clue how it was built, or what the state of the system was when it was built, unless I build it myself, and that is much harder than doing a simple nix-build on a Nix/NixOS machine, I can assure you.

To reduce commit bottlenecks in Debian repos… you start to add a lot of third-party repos… and as we all know, adding more third-party apt repos basically kills a Debian-based distro after some time. NixOS is the only thing I've seen survive this abuse! :-)

I only mention Debian as an example… but they all have this problem, all of them, because that's the way binary distributions work… you need a lot of trust not only in the building of software, but also in its packaging and distribution. With Nix you only need to trust the build recipe, and you can do everything else yourself if you're having a tin-foil-hat moment.

So it's swings and roundabouts…

There's nothing to stop you forking nixpkgs and controlling and auditing every change. That requires human effort, a lot of it, but it's not impossible, and that's what makes Nix/NixOS so bloody brilliant.

If you had enough human effort to manage your own nixpkgs, you'd have your own OS. You could set whatever rules you wanted and let whoever you want merge PRs. However, it may soon diverge to a state where it can never be fed back into mainline. You would have just given birth to a new operating system… which may be a bad thing or a good thing…

A lot of the activity is not actually the core operating system or infrastructure; it's application bumps, or library bumps (not core ones either). Not everything is as critical as libc, the kernel, or OpenSSL.

The nixos-small channel, I presume, was an attempt at this… both to reduce testing time and to have a more auditable number of commits coming through, for lean and mean machines (servers).

This leads on to secure software bills of materials… Nix, believe it or not, is best in class for providing this information, because Nix cares a lot about its inputs…
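(A small illustration of why: the full set of inputs is right there in the store. Assuming a NixOS machine, this lists every store path in the closure of the running system, i.e. everything it transitively depends on:)

$ # every runtime dependency of the current system, one store path per line
$ nix-store --query --requisites /run/current-system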

At the end of the day, security is a matter of trust. Nix needs more engineering effort… paid effort, to keep the garden well maintained and watered.

It's going to be interesting to see how it pans out. History in the making.

Things will get more interesting as Nix adoption increases and committers go from 4,000 to 400,000.

Hmmm… interesting times.

I have to agree with you, it's a rabbit hole… The other distros ask these questions as well; I wonder what solutions they have for 'success' and 'contributor popularity'.

3 Likes

What about using a Web of Trust like this?

Certainly reviewing packages one by one is impossible, but reviewing them socially is way more scalable.


3 Likes

Rewrite everything with algebraic effects?

1 Like

If this is true then someone should tell the kernel. They get more activity than we do and they review it well.

That said, they do have one key thing that we don't: subsystem maintainers. Now, I don't know much about kernel development, so I might be misrepresenting this, but as I understand it, the kernel has a hierarchy of maintainers who take ownership of specific subsystems.

I said this three and, in fact, eight(!!!) years ago (although most of that email is not true anymore and I have other opinions today)! Nobody would listen or care. "We don't have enough resources/contributors" was the general reply back then, IIRC. Which was ridiculous back then and still is, because kernel-style development scales from literally single-digit numbers of contributors to… well… kernel size.

The former (link) is still more or less my opinion on the topic, except for the part saying we should do ML (mailing-list) development.

That said, I'm all for subsystem maintainers, but still with a monorepo. We can have subsystem repos where general development happens, but in the end we should provide only one repository where users get their truth from.
What I could imagine: subsystems = Python packages, Node packages, Haskell packages, etc. are each single subsystems. Each subsystem gets >= 2 maintainers who merge patches to their trees. They merge to mainline at intervals (maybe even every day, possibly completely automated), and that's basically it. I doubt anyone can argue "we don't have enough maintainers" nowadays.

I have very little hope that anything will change though.


Besides: have a look at the numbers in the blog article - absolutely rookie numbers… I mean… 1.8k open PRs? Sounds more like some barely maintained distro, right? :sob:

2 Likes

We kind of already moved to this slowly over time, but within the monorepo. See python-updates or haskell-updates as examples.
There are already automated merges from master → staging-next → staging, and at least once a week (sometimes more often, sometimes less) someone needs to fix merge conflicts there. I think that would happen a lot more with more automated merges, and someone could end up spending their whole day fixing merge conflicts.

Kernel-style development with mailing lists gates contributions behind having an email setup. Graphical email clients don't scale to our size; Thunderbird already chokes on my inbox right now and takes minutes to load. If people first need to learn and configure mutt, then someone new is busy for a week. Then you need an email hoster that is compatible, and if someone sends an HTML mail, they will be chased out of town.
Emails can't be edited: if I notice straight away that I forgot to fix something, your inbox gets filled with a follow-up mail instead of an edit no one will notice. I do that frequently, and I doubt that just reading through the mail one more time would be sufficient.

Also, the kernel has one giant advantage: if some unmaintained driver for old hardware does not build and no one notices for two years, that is not that big of a problem. If one NixOS module does not evaluate, then nixos-unstable can't advance.

Also, open PRs is not necessarily a good metric. If we merge 2.6k PRs a month, then having 4.5k open is just a backlog of 6 to 7 weeks. Probably 2k of them are stale and people are just too afraid to close them. So it is not really as bad as it seems.

But I am not happy with the issue tracking or GitHub search. I don't have a good solution for that.

Also, self-hosting is not really a good option IMO either, because we would need at least one person, probably several, to maintain and scale it full time. Right now we can just sit back and wait for GitHub to fix things.

7 Likes

Also some things I want to underline/add:

  • The current workflow is not perfect or optimal.
  • Do I have easy-to-implement ideas to improve that? No, not really. Well, some, see below. All of them require good planning and acceptance, otherwise they would cause more confusion, silos, and potentially a "split" of the community.
    • Maybe we could work with project dashboards and labels to improve the issue overview?
    • Automatically updated dashboards for PRs, so that reviewers can easily find things that are ready.
    • Automated reviews with tools that parse the AST to find common mistakes like a missing pythonImportsCheck or meta, common errors in the log (no Python tests run), etc. (a crude sketch follows below this list).
    • bors, but only the part that merges the PR when the CI checks are green. Automatic rebasing on master would be a major waste of CI resources with questionable to no benefit.
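(To make the AST-review idea a bit more concrete: this is only a crude text-based approximation of what a real AST-based tool could check, run from a nixpkgs checkout, and the branch name and directory are just examples:)

$ # list python-modules files touched relative to master that never mention pythonImportsCheck
$ git diff --name-only origin/master... -- pkgs/development/python-modules \
    | xargs -r grep -L 'pythonImportsCheck'

A real implementation would of course parse the expression (e.g. with rnix-parser) instead of grepping, and would also scan build logs for things like "no tests ran".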
7 Likes

This would be very helpful; it would go a long way toward showing new contributors what is asked of them, without them having to feel like they're wasting valuable maintainer time :slight_smile:

5 Likes

Automated reviews with tools that parse the AST to find common mistakes like a missing pythonImportsCheck or meta, common errors in the log (no Python tests run), etc.

Now, that would be neat.

On an off-topic note, does anyone know of an overview of code in different languages for dealing with the Nix AST? I know of rnix-parser in Rust, but are there others? It could be a very interesting project that would add some real quality-of-life improvements to the PR workflow.

3 Likes

Automated nixpkgs-hammering would be nice. A GitHub Action already exists for it.

4 Likes

I don't see how that could cause confusion or split the community. We need more labels!

Labelling could actually even help the original issue.

My idea goes like this: the soonest a PR labelled as "security" can be merged is x days (e.g. a week) after creation.
This increases trust in the PR, because it is more likely to have gone past some eyes in that time frame; if there was something wrong with the PR, someone would have noticed and screamed. This won't work if the PR is merged immediately, before anyone has had time to apply the security label, of course, so a general, smaller time limit for all PRs would also be needed.
Since some security patches are time-critical, an ack from multiple committers could allow merges before that time limit, since that is more than enough proof that the right people have seen it. Merging huge security issues like the OpenSSL vuln we had last year would be trivial to coordinate, because the disclosure is known ahead of time.

I think that has the potential to be very annoying because: What if the check is wrong?

What if it stays wrong forever because nobody bothers to fix it?

I think these are realistic scenarios.

Do you mean for cases where you have reviewed someone else's PR but ofBorg is taking its sweet time?

Because I also see potential for more aggressive self-merges here. Push and tell the bot to merge after CI → merge a breaking change, because our CI is very basic.

That example can't, really; I was thinking more about the case where some parts move to a mail-based workflow.

This is why we must think it through and get it right instead of rushing it.

Homebrew used to be split up, and I encountered this exact issue trying to contribute OpenJazz to the "games" tap, as it had a dependency on libmodplug that was only in some other tap. Since then they've abandoned that approach and reverted to a single large repo for all "supported" packages.

That said, I think there is something to be learned from Debian/Ubuntu here, and that is that a split between main, universe, and multiverse is tenable. Main packages can only depend on other main packages, and universe packages can only depend on main and universe packages. So although the main component is very large, they have indeed succeeded in isolating a central pool of packages which are agreed to be more critical and which receive more attention and scrutiny on updates.

4 Likes

Any idea what happened to r-rmcgibbo? rmcgibbo seems to have been mostly inactive recently; maybe we can bring it back under a different account?

1 Like

I share some of these worries, and I was thinking about habits to avoid unnecessarily exposing myself to risk when running "one-off" programs. One way to do this would be to run them as a separate user, so they don't have access to my SSH, GPG, cloud, etc. credentials.
I was thinking of ways to do this using the systemd DynamicUser concept, where systemd creates a temporary user for a service, just for that invocation, but in an interactive shell.

I found the following to be a useful starting point.

$ sudo systemd-run -E PATH=$PATH -E NIX_PATH=$NIX_PATH  -p DynamicUser=yes -t --send-sighup /usr/bin/env bash
Running as unit: run-u8078.service
Press ^] three times within 1s to disconnect TTY.

[run-u8078@nixos:/]$ # Shell belongs to dynamic user
[run-u8078@nixos:/]$ whoami
run-u8078
[run-u8078@nixos:/]$ # Example of command not available
[run-u8078@nixos:/]$ wget
wget: command not found
[run-u8078@nixos:/]$ # Enter a nix-shell to make it available
[run-u8078@nixos:/]$ nix-shell -p wget
[nix-shell:/]$ wget
wget: missing URL
Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.

Without passing the PATH environment variable, the spawned shell has no useful path at all. When I added NIX_PATH as well, I could use nix-shell. But I am unsure what the optimal way to initialize it would be, all things considered.

This just illustrates a starting point. It is possible to modify the environment further. For example, I can add
-p NetworkNamespacePath=/var/run/netns/my-vpn-namespace
to spawn the process into a different network namespace.
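(Putting it together, a sketch of the full invocation; the netns name above is just a placeholder for whatever namespace you have set up:)

$ sudo systemd-run -E PATH=$PATH -E NIX_PATH=$NIX_PATH \
    -p DynamicUser=yes -p NetworkNamespacePath=/var/run/netns/my-vpn-namespace \
    -t --send-sighup /usr/bin/env bash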

9 Likes