There are all kinds of distro-agnostic tools and file formats that try to talk about software packages and versions - recently, there’s been a lot of activity around SBOM and security scanning.
Identifying software is a recurring problem for those. One promising emerging standard for this is package-url (purl). I think it is time we start defining how to refer to Nix packages using purls.
This has been discussed in a couple of places, such as Add guix and nix as package types · Issue #149 · package-url/purl-spec · GitHub, The future of the vulnerability roundups, Things to learn from tea.xyz, Add Nix cataloger by wagoodman · Pull Request #1696 · anchore/syft · GitHub and in the #slsa:nixos.org Matrix channel (not sure if there’s any public history I can link to?).
To give a super-quick introduction to purl: a purl typically has the structure `scheme:type/namespace/name@version?qualifiers#subpath`, where `scheme` is always `pkg`, `type` is a type from the registry at https://github.com/package-url/purl-spec/blob/c02b002f09bdc88a501f62259eec18761957828a/PURL-TYPES.rst, and `namespace`, `name` and `qualifiers` are type-specific. We could define a `nix` purl type and decide how to populate it.
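To make the structure concrete, here is a minimal Python sketch that splits a purl into its components. It is a simplification for illustration, not a complete implementation of the purl spec; the function name `parse_purl` and the helper `_split_last` are made up for this example.

```python
from urllib.parse import unquote


def _split_last(text: str, sep: str) -> tuple[str, str]:
    """Split on the last occurrence of sep; the tail is '' if sep is absent."""
    head, found, tail = text.rpartition(sep)
    return (head, tail) if found else (text, "")


def parse_purl(purl: str) -> dict:
    """Split scheme:type/namespace/name@version?qualifiers#subpath (simplified)."""
    rest, subpath = _split_last(purl, "#")
    rest, query = _split_last(rest, "?")
    scheme, _, rest = rest.partition(":")
    if scheme != "pkg":
        raise ValueError(f"scheme must be 'pkg', got {scheme!r}")
    ptype, _, rest = rest.strip("/").partition("/")
    rest, version = _split_last(rest, "@")
    # Everything before the last '/' is the (optional) namespace.
    namespace, _, name = rest.rpartition("/")
    qualifiers = {}
    for pair in filter(None, query.split("&")):
        key, _, value = pair.partition("=")
        qualifiers[key] = unquote(value)
    return {
        "type": ptype.lower(),
        "namespace": unquote(namespace),
        "name": unquote(name),
        "version": unquote(version),
        "qualifiers": qualifiers,
        "subpath": subpath,
    }


if __name__ == "__main__":
    print(parse_purl("pkg:npm/%40angular/animation@12.3.1"))
```

Qualifier values are decoded with `unquote` rather than a form-style query parser, so a literal `+` in a value is preserved instead of being turned into a space.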
You’ll notice ‘the same’ software could be present in multiple `type`s. This is intentional and useful: that way you can distinguish between information about ‘software X’ generally and information about ‘software X as packaged in NixOS’.
I think the `nix` purl type should be symbolic enough that tools have enough information to perform some level of ‘fuzzy matching’, but should also contain all the information needed to know exactly how to recreate that specific build of a package.
Of course, we already have a format to refer to Nix packages: flake URIs. As a straw man to get the discussion started, I would like to propose a definition of the `nix` purl type as a sort of different representation of the flake URI (since they have slightly different rules). I came up with some rules for defaults to make it succinct to refer to nixpkgs packages, but keep things general enough to also use this type to refer to any third-party Nix package:
pkg:nix/[<org>/]<attr>?<qualifiers>
Where:
- `org` defaults to `NixOS` when not specified
- `attr` is the attribute path to the package
And the following qualifiers can be added:
- `type`: corresponds to the Flake type. For the `NixOS` org, defaults to `github` for now (though we can reserve the right to change this default in the future, as long as history is kept across forges)
- `repo`: the GitHub repo under the `org`. Defaults to `nixpkgs` when the org is `NixOS`, otherwise to (the first segment of) the attribute path
- `ref`: tag or branch in the repo
- `rev`: revision, which must be part of the `ref` tree
- `output`: the derivation output, defaults to `out`
This leads to the following examples (purl and flake syntax side-by-side):
| purl | flake |
|---|---|
| `pkg:nix/wget` | `github:NixOS/nixpkgs#wget` |
| `pkg:nix/wget@1.21.3?ref=nixos-unstable&rev=897876e4c484f1e8f92009fd11b7d988a121a4e7` | `github:NixOS/nixpkgs?rev=897876e4c484f1e8f92009fd11b7d988a121a4e7#wget` |
| `pkg:nix/tiiuea/sbomnix?type=github` | `github:tiiuea/sbomnix#sbomnix` |
| `pkg:nix/tiiuea/nixgraph?type=github&repo=sbomnix` | `github:tiiuea/sbomnix#nixgraph` |
| `pkg:nix/python3Packages.enamlx` | `github:NixOS/nixpkgs#python3Packages.enamlx` |
| `pkg:nix/eicas/omeka-s?type=git+https://codeberg.org&rev=bfe132f6540a175beb432c2c95472f929cbf310f` | `git+https://codeberg.org/eicas/omeka-s-flake?rev=bfe132f6540a175beb432c2c95472f929cbf310f#omeka-s` |
| `pkg:nix/grub2@2.06?output=doc&ref=nixos-unstable&rev=897876e4c484f1e8f92009fd11b7d988a121a4e7` | `github:NixOS/nixpkgs?rev=897876e4c484f1e8f92009fd11b7d988a121a4e7#grub2!out` |
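
The defaulting rules can be sketched in Python as a rough illustration of how a tool might turn a `nix` purl into a flake reference. The function name `nix_purl_to_flake_ref` is made up for this example, and several choices are assumptions rather than part of the proposal: the version and the `output` qualifier are ignored, `ref` is dropped whenever `rev` is present (mirroring the second example), a missing `type` outside the `NixOS` org is treated as an error, and non-GitHub types are rendered as `<type>/<org>/<repo>` based only on the Codeberg example.

```python
from urllib.parse import unquote

DEFAULT_ORG = "NixOS"


def nix_purl_to_flake_ref(purl: str) -> str:
    """Translate a proposed nix purl into a flake reference (sketch only)."""
    prefix = "pkg:nix/"
    if not purl.startswith(prefix):
        raise ValueError(f"not a nix purl: {purl!r}")
    body, _, query = purl[len(prefix):].partition("?")
    # The version has no counterpart in the flake reference, so drop it.
    body, _, _version = body.partition("@")
    org, _, attr = body.rpartition("/")
    org = org or DEFAULT_ORG

    qualifiers = {}
    for pair in filter(None, query.split("&")):
        key, _, value = pair.partition("=")
        qualifiers[key] = unquote(value)

    is_nixos = org == DEFAULT_ORG
    flake_type = qualifiers.get("type", "github" if is_nixos else None)
    if flake_type is None:
        # Assumption: no default 'type' is defined outside the NixOS org.
        raise ValueError("'type' qualifier required for orgs other than NixOS")
    # repo defaults to nixpkgs for NixOS, else the first attribute path segment.
    repo = qualifiers.get("repo", "nixpkgs" if is_nixos else attr.split(".")[0])

    if flake_type == "github":
        location = f"github:{org}/{repo}"
    else:
        # Assumption: other types are a URL prefix followed by org and repo.
        location = f"{flake_type}/{org}/{repo}"

    if "rev" in qualifiers:
        params = f"?rev={qualifiers['rev']}"
    elif "ref" in qualifiers:
        params = f"?ref={qualifiers['ref']}"
    else:
        params = ""
    return f"{location}{params}#{attr}"


if __name__ == "__main__":
    print(nix_purl_to_flake_ref("pkg:nix/tiiuea/nixgraph?type=github&repo=sbomnix"))
```

Under these rules the Codeberg example would seem to need an explicit `repo=omeka-s-flake` qualifier to yield the flake reference shown in the table, since the default `repo` would otherwise be `omeka-s`.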
Now this is different from what’s being proposed in syft: they seem to just take the pname (?) and add the output hash. I can see how that is much easier for a filesystem scanning tool such as syft to discover, but it also seems much less useful: it is almost impossible from such a purl to ‘work backwards’ and find the exact derivation without additional context.
Should we ‘allow’ both ‘output-centric’ and ‘input-centric’ purls for the `nix` type? While that would make ‘creating’ purls much easier in some cases, it also might make doing anything useful with them much harder…