Nixtract 0.1.0 - extract graph data from Nix code

Hi all! I am excited to share something I have been working on internally at Tweag with a few others!

Our goal is to provide a simple way to extract the dependency graph of all the software packaged in any Nix code base (though for now it can only extract from flakes). The long-term goal is to start building an ecosystem of tools to help operationalize Nix-based packaging.

Exciting ideas off the top of my head:

  • build a database of everything ever packaged in nixpkgs
  • combine with a Nix binary cache to store the closure size of every derivation
  • combine with a vulnerability database
  • combine with nix-index to automatically find the right derivation that provides a given file

To read more about it:

The current implementation may not be ideal and there will be bugs, but all of that will improve as we go. At least that’s how I see things :smiley:

Uh, that looks so nice! Thanks a lot! I’ve been thinking about various tools in the past which could make use of something like this! :heart:

Sounds nice!

Doesn’t your project have some overlap with the ongoing efforts on SBOM generators?

Well spotted :wink: It’s not just some overlap. :slight_smile:

When I left Tweag, @Arsleust took over the project :wink:

Nice. Being able to generate SBOMs conveniently and reliably will help Nix adoption a lot in regulated industries.

I immediately went to try using it with nix run, only to find it’s not packaged as a flake. I sort of assumed a flake utility would be packaged as one.

Thanks for the feedback. nixtract can now be run with Nix using its flake. :slight_smile:

Ref: Package with Nix by GuillaumeDesforges · Pull Request #4 · tweag/nixtract · GitHub

Sadly, that flake only has x86_64-linux, no aarch64 or even darwin.

I still hope an effort is made to make this stuff integrate with bombon (GitHub - nikstur/bombon: Nix CycloneDX Software Bills of Materials (SBOMs)), given that it is already there and works quite well. Has it been on Tweag’s radar? It’d be a real shame if two groups of people were developing the exact same thing without any coordination.

I have too little insight to know whether merging them would make sense, and either way that’s of course for the respective maintainers to decide. But while we are collecting related work, I want to mention sbomnix (GitHub - tiiuae/sbomnix: A suite of utilities to help with software supply chain challenges on nix targets) and its subproject nixgraph: https://github.com/tiiuae/sbomnix/blob/b920aa90ee291defa00e1757d81399f37c76c853/doc/nixgraph.md (no affiliation)

Back then, I evaluated both projects, but it was decided to start a third one, reusing the ongoing work on the nixpkgs graph analyzer.

It was silly of me. This has been fixed :+1:
https://github.com/tweag/nixtract/commit/3c20d3cceffd46e847f364679073cf26e0b173c2

As Solène said above, we knew about Bombon before starting Nixtract.
The two projects share many similarities, yet as far as I understand they have major differences:

  • bombon focuses on CycloneDX generation, nixtract aims to be pluggable into many use cases
  • bombon makes the SBOM for a derivation, nixtract works on a whole flake (it’d be possible to make it more versatile)
  • bombon builds the SBOM at once, nixtract streams information
  • bombon builds the closure inside Nix evaluation through recursion, nixtract uses two steps and spawns many small Nix interpreters to limit memory usage
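
To make the streaming, one-derivation-at-a-time idea from the last two bullets more concrete, here is a minimal Python sketch. It is not nixtract's actual code: `stream_graph`, `describe`, the `build_inputs` field and the toy graph are all hypothetical names made up for illustration. In a real tool, `describe` would presumably spawn a small Nix process for a single derivation; here it is a plain function so the example runs without Nix.

```python
import json
import sys
from collections import deque
from typing import Callable, Iterable, Iterator, TextIO


def stream_graph(
    roots: Iterable[str],
    describe: Callable[[str], dict],
) -> Iterator[dict]:
    """Yield one description per derivation, breadth-first, each one only once.

    `describe` is a hypothetical callback: given a derivation identifier, it
    returns a dict with (at least) a "build_inputs" list of identifiers.
    """
    queue = deque(dict.fromkeys(roots))  # de-duplicate roots, keep their order
    seen = set(queue)
    while queue:
        drv = queue.popleft()
        node = describe(drv)  # one small, independent evaluation per node
        yield node            # emitted immediately: nothing is accumulated
        for dep in node.get("build_inputs", []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)


def write_jsonl(nodes: Iterable[dict], out: TextIO = sys.stdout) -> None:
    """Write each node as one JSON object per line (JSONL)."""
    for node in nodes:
        out.write(json.dumps(node) + "\n")


# Toy stand-in for the test flake mentioned in this thread (pkg2 <- pkg1).
TOY_GRAPH = {"pkg2": ["pkg1"], "pkg1": []}


def describe_toy(name: str) -> dict:
    """Fake `describe` that looks the derivation up in TOY_GRAPH."""
    return {"name": name, "build_inputs": TOY_GRAPH[name]}


if __name__ == "__main__":
    write_jsonl(stream_graph(["pkg2"], describe_toy))
```

The point of the generator is that a consumer can start processing the first lines while later derivations are still being described, and the only state kept in memory is the queue and the set of already-seen identifiers.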

Of course one can take inspiration from the other!

Just saying that nix-eval-jobs also exposes input derivations, which could be used to efficiently figure out build dependency graphs.

That’s a great point! nix-eval-jobs is very similar. It describes Nix derivations and outputs a JSONL stream. It is even a few steps ahead of nixtract on some technical aspects (driving the evaluation directly from C++ instead of spawning Nix processes).
Funny thing is, I learned about it quite late, after the first working implementation of nixtract, so it is quite interesting to see how the design of nixtract converged in many ways toward that of nix-eval-jobs.

To my understanding though, nix-eval-jobs does not recurse into the inputs of the derivation to describe the whole graph of dependencies, which nixtract does (and which introduces some technical challenges).

So they are not quite the same, but they do have many things in common, and I will keep that in mind for future development.

There is a --force-recurse option as well, but if you already have some code that can recurse, maybe the two could be combined somehow. Do you have numbers for peak memory usage? Looking at ofborg, if you don’t restart evaluation, that could be quite high. Or can you split evaluation?

Nothing like an undocumented option :smile:

I’m interested in exploring nix-eval-jobs. It uses C++ directly, which is a huge plus compared to using Python subprocesses (I was looking at possibly using C bindings for nixtract: (Towards) stable C bindings for libutil, libexpr by yorickvP · Pull Request #8699 · NixOS/nix · GitHub).

I tried running it on nixtract’s test flake, which has only pkg2 <- pkg1, but nix-eval-jobs did not recurse into the build inputs.

$ nix-eval-jobs --force-recurse --flake github:tweag/nixtract?dir=tests/fixtures/flake-direct-buildInput#packages
warning: unknown setting 'allowed-users'
warning: unknown setting 'bash-prompt-prefix'
warning: unknown setting 'trusted-users'
warning: `--gc-roots-dir' not specified
{"attr":"aarch64-darwin.default","attrPath":["aarch64-darwin","default"],"drvPath":"/nix/store/37f0brjzkc89lrdi65v6w1kbgy5b7y56-pkg2.drv","name":"pkg2","outputs":{"out":"/nix/store/qacd4n10pilgn7bglw19s1anxi67ldm0-pkg2"},"system":"aarch64-darwin"}
{"attr":"aarch64-linux.default","attrPath":["aarch64-linux","default"],"drvPath":"/nix/store/531p2iky51lb1myq5l07xmmx1mqqxwda-pkg2.drv","name":"pkg2","outputs":{"out":"/nix/store/f7m2ngq17h5igy0zjn21fk3bj0dlsgbh-pkg2"},"system":"aarch64-linux"}
{"attr":"x86_64-darwin.default","attrPath":["x86_64-darwin","default"],"drvPath":"/nix/store/86xn1v7jf5asivpjbsl1ag012gf7551p-pkg2.drv","name":"pkg2","outputs":{"out":"/nix/store/xz3wjaxr7w5lwr9dx717lfkzfq3l6c6m-pkg2"},"system":"x86_64-darwin"}
{"attr":"x86_64-linux.default","attrPath":["x86_64-linux","default"],"drvPath":"/nix/store/cpkqmp1f40h7wpzz8d1fwyms3f9bi35v-pkg2.drv","name":"pkg2","outputs":{"out":"/nix/store/rap4q86aahlzybjsz00cldsi726rhqz2-pkg2"},"system":"x86_64-linux"}

This does not display anything for pkg1.
So I still think nixtract does something that nix-eval-jobs does not (yet?).
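
For anyone who wants to check this kind of output programmatically, here is a small Python sketch that parses the JSONL and lists which package names were reported for each system. The two sample lines are copied from the output above; the helper names `parse_jobs` and `names_by_system` are mine, and I am assuming the warnings go to stderr so that stdout contains only JSON lines.

```python
import json

# Two of the lines printed by nix-eval-jobs above, copied verbatim.
SAMPLE = """\
{"attr":"aarch64-linux.default","attrPath":["aarch64-linux","default"],"drvPath":"/nix/store/531p2iky51lb1myq5l07xmmx1mqqxwda-pkg2.drv","name":"pkg2","outputs":{"out":"/nix/store/f7m2ngq17h5igy0zjn21fk3bj0dlsgbh-pkg2"},"system":"aarch64-linux"}
{"attr":"x86_64-linux.default","attrPath":["x86_64-linux","default"],"drvPath":"/nix/store/cpkqmp1f40h7wpzz8d1fwyms3f9bi35v-pkg2.drv","name":"pkg2","outputs":{"out":"/nix/store/rap4q86aahlzybjsz00cldsi726rhqz2-pkg2"},"system":"x86_64-linux"}
"""


def parse_jobs(text: str) -> list[dict]:
    """Parse JSONL text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def names_by_system(jobs: list[dict]) -> dict[str, set[str]]:
    """Group the reported derivation names by their "system" field."""
    result: dict[str, set[str]] = {}
    for job in jobs:
        result.setdefault(job["system"], set()).add(job["name"])
    return result


if __name__ == "__main__":
    for system, names in sorted(names_by_system(parse_jobs(SAMPLE)).items()):
        print(system, sorted(names))
```

On the sample, every system maps to pkg2 only, which matches the observation that pkg1 never shows up.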

However, it should be possible to improve nixtract using ideas from nix-eval-jobs, especially for performance (nixtract takes a few hours to extract data from the entire nixpkgs).

You can now find nixtract on PyPI as nixtract-cli.