Nix State of the SBOM

https://arnout.engelen.eu/blog/nix-state-of-the-sbom/

15 Likes

Genealogos actually also works on the “Nix-level”, which is why both it and bombon miss the ‘assets’ and ‘google-fonts’ dependencies. When I wrote Genealogos I wasn’t able to figure out how to access the string contexts from Nix code, hence those dependencies are missing. I assume bombon ran into the same issue.

About the lock file idea near the end of the post: I actually had a prototype for Rust packages that read the Cargo.lock file and included those dependencies, but due to time constraints this never made it to the main branch.

Ah, not sure how I got the impression that it worked on derivations - updated the post!

Hmm, interesting.

1 Like

It’s cool to see this survey of the landscape. Thank you for putting it together! At work we’ve been building out tooling for generating SBOMs. Right now, we’re generating SBOMs for all the packages in Nixpkgs that we support in Determinate Secure Packages. That’s useful as a start, but doesn’t go all the way out to our users’ software. Hopefully we’ll be able to publish our tooling in the next few months.

One thing that sticks out to me here is the copying of artifacts. It’s ~impossible to automatically detect duplication / copies generally, since an arbitrary mutation can be performed. However, it occurs to me that with some marking / tagging + automation we could get pretty close.

I’m going to leave it to @RossComputerGuy to share details since he’s been really close to this problem for the past few months, but I’ve been glad to see the work to improve the marking / metadata that is contained in Nixpkgs directly!

4 Likes

Hey, like @grahamc mentioned, I’ve been working closely with SBOMs for the past few months. A large problem I noticed is that there is no metadata on the type of package. This is a huge problem, as formats like CycloneDX and SPDX declare a type property. Usually this includes things such as firmware, drivers, libraries, applications, etc. We cannot detect the type from the name or outputs alone, since that would result in a lot of misidentifications, so we need a new metadata attribute for this.
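To make the gap concrete, here is a rough Python sketch of how an SBOM generator might fill the CycloneDX `type` field. The attribute name `packageType` is hypothetical (it is not taken from Nixpkgs), the type list is only a subset of what CycloneDX allows, and the extra `typeWasGuessed` key is a local bookkeeping field for this example, not a CycloneDX property.

```python
# A few of the component types CycloneDX defines; not the full list.
CYCLONEDX_TYPES = {
    "application",
    "library",
    "framework",
    "firmware",
    "device-driver",  # available in newer CycloneDX versions
    "operating-system",
    "file",
}


def to_component(name, version, meta):
    """Build a CycloneDX-style component from package metadata.

    `meta` stands in for a package's meta attribute set; the key
    "packageType" is a hypothetical name used for this sketch.
    """
    declared = meta.get("packageType")
    if declared in CYCLONEDX_TYPES:
        component_type, guessed = declared, False
    else:
        # Without metadata the generator can only fall back to a default,
        # which is how everything ends up labelled as a library.
        component_type, guessed = "library", True
    return {
        "type": component_type,
        "name": name,
        "version": version,
        "typeWasGuessed": guessed,  # local bookkeeping, not a CycloneDX field
    }


print(to_component("grub", "2.12", {"packageType": "application"}))
print(to_component("zlib", "1.3", {}))
```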

A few other pieces of information we need are around vulnerabilities and patches. When providing an SBOM to grype for scanning, it’ll often indicate the package is out of date. CPEs and PURLs may fix a lot of that, and adding --distro nixos:25.11 has gotten rid of a lot of those findings in grype. The solution would be to add more metadata around known vulnerabilities and fixed vulnerabilities. It would also make sense to add metadata around the patches themselves. In CycloneDX, we have a way to indicate what each patch does. This can be very useful for provenance information.

Aside from that, there are probably others I can find, but they aren’t of immediate concern. I will be giving a talk at FOSDEM & Planet Nix on SBOMs which will have a lot more information.

5 Likes

More context in context string · Issue #4677 · NixOS/nix · GitHub would make stringWithContext easier.

See details here: how to deal with string contexts · Issue #74 · nikstur/bombon · GitHub

As a workaround I made a prototype that recovers the string context to a set of known derivation expressions here: how to deal with string contexts · Issue #74 · nikstur/bombon · GitHub

2 Likes

I saw your work in https://github.com/NixOS/nixpkgs/pull/459543 - do you have any data on how big that problem actually is? I know SBOM formats have this as a required field, but to be honest I had the impression that most SBOM tools mostly ignore it. As mentioned in that PR it does seem helpful to be able to record this per derivation output, as it seems fairly common for one derivation to both produce a ‘library’ and an ‘application’ side…

Interesting, what does that do?

Information about vulnerabilities that we are ‘known not to be affected by’ even though we’re on a potentially-affected version can be helpful. The current convention of checking for patches with CVEs in the name and assuming they’re ‘fixes’ seems surprisingly effective :wink:.
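The patch-name convention can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular tool implements it; the function name and the example patch names are made up.

```python
import re

# CVE identifiers look like CVE-<year>-<sequence of 4 or more digits>.
CVE_ID = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def cves_from_patch_names(patch_names):
    """Collect CVE IDs mentioned in patch file names.

    Assumes, as the convention does, that a patch named after a CVE
    is a fix for that CVE.
    """
    found = set()
    for name in patch_names:
        for match in CVE_ID.findall(name):
            found.add(match.upper())
    return sorted(found)


# Hypothetical patch names for illustration.
patches = [
    "CVE-2023-1234.patch",
    "fix-build-with-gcc14.patch",
    "0001-cve-2024-56789-bounds-check.patch",
]
print(cves_from_patch_names(patches))
```

By construction this misses any security fix whose patch name does not mention a CVE, and it cannot tell a partial fix from a complete one.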

While populating ‘knownVulnerabilities’ is nice, I don’t think it really belongs in SBOMs used for vulnerability scanning: often a newly-found vulnerability will affect old versions, and I definitely want the scanners to flag it when I’m still on an older version of nixpkgs and components I use have advisories that were not yet known back then.

2 Likes

I’m not aware of how big the problem is. However, it may help make detection more accurate. I’ve also had people ask “why is everything a library when this is an application?” Having this accurately recorded in meta definitely helps.

Yeah, however, I’m not sure how well SBOM formats handle things being the same package but a different output. It also means we’d have to record how something was consumed as a dependency. Even the dependencies list requires the package type property.

I’m not entirely sure. It does add distro information to the grype JSON I generate, and it seems to change the results when checking against the vulnerability databases.

I’ve tried checking patch names for CVEs but have had little to no success. Also, not every patch that fixes a security vulnerability has a CVE in its name. There’s also the problem of databases not being accurate. When a database is inaccurate, the scanner reports a package as affected but does not know when it was patched, so it cannot match the fix to the vulnerability. Being able to record patches that have already been merged and that fix vulnerabilities can definitely help, especially when we use an unstable package version or when the database is inaccurate.

1 Like

I agree. My $dayjob consists of CVE remediation for a RHEL-derivative distribution, so a 36+ month LTS nixpkgs branch is quite interesting for me to work on. As such, I’ve often wondered how this should be handled and have come to a similar conclusion: meta.knownVulnerabilities is a good solution for an unstable, rolling release branch of nixpkgs. But that’s not how you deal with vulnerabilities on a long-term branch that is 2+ years behind its upstreams’ versions.

For a real competitor to Debian/Ubuntu/RHEL in terms of lifecycle duration, a better solution is to assume that any version released before the CVE’s initial disclosure is vulnerable unless explicitly specified otherwise. That’s what we do at $dayjob. As such, meta.remediatedCVEs fits better here. Folks working on the stable branch at a corporation like Red Hat/Canonical would either bump the version (rarer than unicorns) to match the remediated version from upstream, or backport a patch along with the multiple commits it depends on, and issue a security advisory saying “Yes, vx.y.z is old and might be flagged by your security scanner, but we have in fact patched the CVE, so rest assured. Here’s our CSAF file that your security scanner can pick up and hopefully (there are misses at times, unfortunately) it gives you the correct status of your system’s state.”
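As a rough illustration of that rule, here is a small Python sketch. The function name, the status strings, and the example dates and CVE ID are all made up; `remediated_cves` stands in for the contents of a proposed attribute like meta.remediatedCVEs, which is an idea from this post and not an existing Nixpkgs attribute as far as this thread shows.

```python
from datetime import date


def cve_status(cve_id, disclosed, package_released, remediated_cves):
    """Decide how a long-term branch could report a CVE for a package.

    - explicitly remediated (patched or backported): "fixed"
    - released before disclosure and not remediated: "assumed-affected"
    - released after disclosure: "needs-review" (this sketch does not
      try to guess whether upstream already included the fix)
    """
    if cve_id in remediated_cves:
        return "fixed"
    if package_released < disclosed:
        return "assumed-affected"
    return "needs-review"


# Hypothetical data for illustration.
released = date(2022, 5, 1)
disclosed = date(2024, 3, 15)

print(cve_status("CVE-2024-0001", disclosed, released, set()))
print(cve_status("CVE-2024-0001", disclosed, released, {"CVE-2024-0001"}))
```

The default is deliberately pessimistic: an old version stays flagged until someone records the remediation, which matches the advisory-driven workflow described above.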

I’ve worked on CVEs for userspace packages and the kernel, and the biggest benefit that nixpkgs has here over almost every distribution in long-term maintenance (except for maybe Guix; it looks similar to NixOS, since I guess Guix is a fork of Nix, but I’ve never actually tried it) is that you can actually test CVE regressions and ensure that you get consistent results, since NixOS is a source-based distribution. There are so many more advantages, but I think this post is getting a bit too ADHD-induced-ramble-ly from the main topic and quoted text. I’ll stop here.

2 Likes