Nixpkgs supply chain security project

Updates on the Nixpkgs security tracker

We reached the GitHub login and manual record linkage milestone at the end of August. As of today, delivery of the project is late by another 4 months, on top of the 8 months of delay already in the previous plan. It’s a disappointing result – many reasons, no excuses.

I’m back to it now and take budget responsibility and full accountability for getting the system into a state that is usable by the security team by Friday 2024-12-13. This means that if you have questions, ask me directly. I’ve allocated at least 5 hours per week for active planning and direction.

Ping me if you want to join the private Matrix room for beta testing, where we already collected some initial feedback from people involved in Nixpkgs security.

@erictapen will work full time on designing and building the user interface for the planned workflows. @raitobezarius will improve the performance of the data ingestion and processing pipeline, and @proofconstruction will support us with backend-related tasks, each for a couple of hours per week, with assistance from @alejandrosame as needed. @Erethon will latch onto the NixOS infrastructure team to ensure we have the service deployed to nixos.org for demo day.

This week we have refined the user interactions to a degree where we can break down the implementation requirements into manageable chunks. The general idea is that a vulnerability record goes through three phases:

  1. Initial triaging

    Security team members get presented with a queue of pre-computed suggestions for matching CVEs to Nixpkgs derivations, which they would filter for things to inspect further. We already have most of the data needed for decent automatic matching, but it needs to be validated by humans, and we expect most of it to be thrown away. The aspirational goal is to make it convenient enough to ultimately reach “inbox zero”.

  2. Draft

    An automatic match will never be perfect, so a record needs to be adjusted. Many CVEs we receive have duplicates, and the information they contain is often coarse or ambiguous, so the security team will have to check and possibly correct which derivations a record applies to before publishing it.

  3. Mitigation

    Once a draft is ready, it should be published as a GitHub issue where all affected maintainers are pinged. The usual workflow continues from there. Once the issue is closed, the vulnerability record is archived automatically.
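The three phases above amount to a small state machine for each vulnerability record. A minimal sketch in Python – all names and transitions here are illustrative, not the tracker’s actual schema:

```python
from dataclasses import dataclass
from enum import Enum, auto


class Phase(Enum):
    TRIAGE = auto()      # pre-computed suggestion awaiting human review
    DRAFT = auto()       # accepted match being adjusted by the security team
    MITIGATION = auto()  # published as a GitHub issue, maintainers pinged
    ARCHIVED = auto()    # issue closed, record archived automatically


@dataclass
class VulnerabilityRecord:
    cve_id: str
    derivations: set[str]
    phase: Phase = Phase.TRIAGE

    def accept(self) -> None:
        """Triage: keep the suggestion for closer inspection."""
        assert self.phase is Phase.TRIAGE
        self.phase = Phase.DRAFT

    def publish(self) -> None:
        """Draft: derivation list corrected, publish as a GitHub issue."""
        assert self.phase is Phase.DRAFT
        self.phase = Phase.MITIGATION

    def close_issue(self) -> None:
        """Mitigation: the issue was closed, archive the record."""
        assert self.phase is Phase.MITIGATION
        self.phase = Phase.ARCHIVED


record = VulnerabilityRecord("CVE-2024-0001", {"openssl"})
record.accept()
record.publish()
record.close_issue()
```

The assertions enforce that a record only moves forward through the phases, which matches the one-way flow described above.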

You can follow progress in the milestones on GitHub.

We have a bunch of obvious refinements for these workflows in the backlog, such as smoother, structured searching and filtering for CVEs and derivations, and periodic notifications for security team members and maintainers, but those are currently not prioritised. We’ll focus on enabling the essential user stories for now.

On using GitHub dependency submissions

@sambacha Thanks for the pointer. I’ve investigated the possibility of offloading advisory handling to GitHub, and doing that can at best be considered future work.

There exist tweag/genealogos (a Nix SBOM generator, by @ErinvanderVeen) and tweag/nixtract (a CLI tool to extract the graph of derivations from a Nix flake, by @Arsleust) to generate dependency information that could be consumed by GitHub, but those can’t be used at Nixpkgs scale. The security tracker currently periodically evaluates all tracked channel branches and puts everything into a database. We could construct SBOMs or any required format from there and submit it within rate limits, but that’s essentially a different project altogether.
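To give an idea of the shape of that work, here is a rough sketch of constructing a minimal CycloneDX-style SBOM from such a database. The top-level field names follow the CycloneDX 1.5 JSON format, but the database rows are invented, and `pkg:nix` is not a standardised purl type – this is an assumption for illustration only:

```python
# Hypothetical rows as the tracker's evaluator might store them:
# (attribute path, package name, version)
rows = [
    ("nixpkgs.openssl", "openssl", "3.0.13"),
    ("nixpkgs.curl", "curl", "8.6.0"),
]


def to_cyclonedx(rows):
    """Build a minimal CycloneDX 1.5 JSON document from evaluated derivations."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [
            {
                "type": "library",
                "name": name,
                "version": version,
                # "nix" is not an official purl type; illustrative only
                "purl": f"pkg:nix/{name}@{version}",
            }
            for _attr, name, version in rows
        ],
    }


sbom = to_cyclonedx(rows)
```

Batching such documents and submitting them to GitHub within rate limits would be the (unsolved) remainder of that hypothetical project.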

And then we’d still have a UX for addressing advisories that is by far inferior and, most importantly, doesn’t take into account how Nixpkgs is currently developed. This project is to serve the security team and package maintainers, and GitHub simply doesn’t support our current needs.

Last but not least, there are good reasons to keep our dependency on GitHub limited. Both STF and NGI are interested in us (and everyone else) adopting free and open source software for all workflows. Control over the supply chain is part of supply chain security.


We’re making progress. In the past six weeks we’ve been fighting reproducibility issues, sickness, and various time sinks inside and outside of the project. @proofconstruction had to drop out, @erictapen iterated over a couple of designs, @raitobezarius supplied the data models and a staging deployment, and @erethon made the first steps towards a deployment to official infra (thanks @Mic92 and the infra team!).

Two weeks ago we sent our beta testers an interactive demo for clicking around. The workflow for dealing with automatic suggestions to match CVEs with Nixpkgs derivations is now essentially complete, and we all have a much clearer vision of the look and feel we’re moving towards.

The main idea behind triaging is that the queue of automatic suggestions shrinks monotonically: either a match is irrelevant, or it gets considered in more detail. If a suggestion is selected as a candidate, then after removing irrelevant (and later perhaps adding missing) affected derivations, it moves to the next step in the process: a draft security issue, where one can adjust more details and eventually notify maintainers.

Note that we’re currently focused on defining workflows and visually structuring the available information. Fine-tuning colors, shapes, and proportions will happen later in the process.

[screenshot]

We didn’t fully complete our objectives for the first milestone due to some unexpected delays and tricky details that needed to be figured out. What was originally intended to be “quick triaging” was cut down in scope and renamed to basic triaging. We still have to surface more data for at-a-glance decisions, query performance is not yet suitable for real-world use, and there’s still a bug in the record linkage algorithm that creates countless duplicates.

The next steps will therefore revolve around making the triage workflow more practical – primarily by reducing the amount of noise and uncertainty – with things such as:

  • showing derivation descriptions in the overview
  • allowing removal of unwanted derivations
  • filtering out dependent derivations
  • sorting suggestions by relevance
  • fixing various bugs and problems with the data
  • displaying state changes in an activity log
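Two of those items – filtering out dependent derivations and sorting suggestions by relevance – could look roughly like this. The scoring and the dependency map are made-up placeholders, not the tracker’s real data model:

```python
def filter_direct(suggestions, depends_on):
    """Drop derivations that are only affected transitively: keep a
    derivation only if none of its dependencies is also suggested."""
    names = {s["drv"] for s in suggestions}
    return [
        s for s in suggestions
        if not (depends_on.get(s["drv"], set()) & names)
    ]


def by_relevance(suggestions):
    """Sort best matches first (higher score = closer CVE/derivation match)."""
    return sorted(suggestions, key=lambda s: s["score"], reverse=True)


suggestions = [
    {"drv": "libfoo", "score": 0.92},
    {"drv": "foo-app", "score": 0.40},  # depends on libfoo
    {"drv": "barlib", "score": 0.75},
]
depends_on = {"foo-app": {"libfoo"}}

direct = by_relevance(filter_direct(suggestions, depends_on))
# foo-app is filtered out as a mere dependent of libfoo
```

The point of both helpers is to shrink the queue before a human ever looks at it, in line with the “monotonically shrinking queue” idea above.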

@alejandrosame will join us to finish some backend work left over from summer. The plan is to reach the second milestone by the end of next week or the beginning of the week after, with the goal of having a triaging workflow that feels about right, even if it’s still slow. We’ll then continue with implementing “editing a draft security issue”.

Given all the complications so far, it’s likely we won’t get to “publishing issues” by mid-December. There’s still a lot to do just to make the thing run continuously. The development setup is finicky. There’s a lot of data, and many moving parts on the way to presenting it. Ensuring minimal stability has always taken the largest share of the time, ever since the project began more than a year ago. This is why the user-facing features still appear rather basic.

But overall this is finally coming together! By skimming through the data we already found some vulnerable packages, and it promises to become quite satisfying to discover even more, so I hope more people will pick it up once it’s deployed in production. Even with some convenient automation still lacking, I’m convinced the insights the security tracker will provide to the security team and maintainers will make a big difference for keeping Nixpkgs and NixOS secure day to day.


I wonder if the folks working on this project have evaluated tiiuae/sbomnix (a suite of utilities to help with software supply chain challenges on Nix targets) and determined whether the work there might be useful in this project?


Yes, we did. @raboof even contributed code. That type of local scanner solves a different problem: which packages do I have on my system that may be affected by vulnerabilities? The online security tracker is for distribution maintainers and helps answer: which vulnerable packages do we have in our distribution channels? The first is useless without the second, because you can’t upgrade if there’s nothing to upgrade to. Maintainers need to know which fixes to apply and which mitigations to prioritise so users have a chance to run secure systems.

Local scanners also deal with different internal problems, mainly an abundance of false-positive matches. The system design for the development program here envisioned improving upon that as well, by providing more accurate data based on the upstream triaging.
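The idea of upstream triage improving local scanning can be illustrated with a toy example – the package names and CVE identifiers below are invented:

```python
# Naive name matching, roughly what a local scanner does:
installed = {"openssl-3.0.13", "curl-8.6.0", "ffmpeg-6.1"}
cve_names = {"CVE-2024-0001": "openssl", "CVE-2024-0002": "ffmpeg"}

naive_hits = {
    cve for cve, name in cve_names.items()
    if any(pkg.startswith(name + "-") for pkg in installed)
}

# With upstream triage data, only human-confirmed CVE/derivation
# matches survive; CVE-2024-0002 was triaged away as a false positive:
confirmed = {"CVE-2024-0001": {"openssl"}}
triaged_hits = naive_hits & confirmed.keys()
```

Name matching alone flags both CVEs; intersecting with the upstream-confirmed set discards the false positive before it reaches the user.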


I hear you.

Just for whatever it’s worth: in some contexts I was using sbomnix to scan packages for known vulnerabilities in package sets that I would then distribute to many users. It was then some work to map the vulnerabilities found to an upgrade path in new releases, or to a published patch. There could be false positives, and it did take some tuning, but after some effort I did not experience an abundance of them.

But in any case, thanks for answering that. Now that I understand more, it seems reasonable to me that this wasn’t a good fit for the Nixpkgs supply chain security use case.


We’ve come a long way since the last update, and are looking forward to our demo day this Friday 2024-12-13T12:00:00Z with the NixOS security team and anyone interested in participating (send me a message to get a calendar invitation).

Triaging is now indeed quick thanks to a number of optimisations, and @erictapen has taken great care to provide as much relevant information as possible at a glance while not overloading the user interface with noise. Last week we closed the second milestone.

At the time of writing, we’ve implemented 17 user stories (distinct workflows or behaviors) and addressed 30 other issues. The system is visibly taking shape.

We’ll spend the next days on smoothing down the most obvious rough edges and doing minor cleanups[1], expecting to finish 6-9 more user stories.

This will leave us with ca. 30 user stories and more than 50 other issues “discovered” along the way, which should be addressed before committing to a production deployment. While there’s always more work to do, it seems we have now done the hardest first 30% of it. This was made possible by the investment from the Sovereign Tech Fund, detailed in the top post. After a successful presentation, we hope to obtain the means to get through the next 30–50% in 2025.

Ping me if you want to join the private Matrix room for beta testing or to participate in the demo.


  1. Cleanups such as fixing a typo in a CSS class name that, after a renaming, messed up the highlighted hint in the screenshot. These things simply require playing around with the application while paying attention to detail. ↩︎
