Just as an update.
I was previously able to convert arbitrary NPM URIs to Nix fetchers, and I could create module trees with linked bins as either node_modules/ or “global” style installations, given structured inputs. I had a large collection of utilities to convert NPM or Yarn locks, and even package.json files with non-conflicting descriptors, into my “structured inputs”, and I even supported workspaces. This worked if you were willing to compose these tools together, but it wasn’t a “works out of the box” solution. You had to manually run any build steps and install phases, which was the largest gap to fill.
Today I had a big breakthrough using NPM’s v2 style lockfiles with “complex” workspaces. The format of the lockfile made it simple to structure trees automatically, and I can Nixify all fetchers and dump trees that are equivalent to --ignore-scripts invocations. My fetchers are abstracted to use either flake inputs, built-in fetchers, or Nixpkgs fetchers, depending on which the user prefers, and I have a mechanism to force specific packages or pattern-matched URIs to use a particular method (this is necessary for a small number of tarballs which contain directory entries, since these will fail if built-in fetching is used). The caching “works as expected” and will short-circuit if it determines that fetched inputs match a stored build - this is NOT something most other utilities handle properly.
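To make the two ideas above concrete, here is a rough Python sketch of how a v2 lockfile's flat "packages" map can be turned into an install plan, with a pattern-based override that forces a particular fetcher. This is not my actual Nix code: `plan_tree`, `pick_fetcher`, the `OVERRIDES` table, the `@example/odd-tarball` package and the placeholder integrity strings are all made up for illustration. The only thing taken from the real format is that install paths are the keys of "packages", workspace members show up as `link` entries, and registry packages carry `resolved` and `integrity`.

```python
import fnmatch

# Hypothetical override table: (glob on the resolved URI, fetcher to force).
# The fetcher names are just labels for the three families mentioned above.
OVERRIDES = [
    ("https://registry.npmjs.org/@example/odd-tarball/*", "nixpkgs"),
]


def pick_fetcher(uri, overrides=OVERRIDES, default="builtin"):
    """Return the first forced fetcher whose pattern matches, else the default."""
    for pattern, fetcher in overrides:
        if fnmatch.fnmatchcase(uri, pattern):
            return fetcher
    return default


def plan_tree(lock):
    """Map every install path in a v2 lockfile to what should be placed there."""
    plan = {}
    for path, entry in lock.get("packages", {}).items():
        if path == "":
            continue  # the empty key is the root project itself
        if entry.get("link"):
            # Workspace members are links to a local directory.
            plan[path] = {"kind": "link", "target": entry["resolved"]}
        elif "resolved" in entry:
            uri = entry["resolved"]
            plan[path] = {
                "kind": "fetch",
                "uri": uri,
                "integrity": entry.get("integrity"),
                "fetcher": pick_fetcher(uri),
            }
        else:
            # Local source directory of a workspace member; nothing to fetch.
            plan[path] = {"kind": "local"}
    return plan


# A tiny hand-written lockfile fragment (already parsed from JSON).
EXAMPLE_LOCK = {
    "lockfileVersion": 2,
    "packages": {
        "": {"name": "root", "workspaces": ["packages/*"]},
        "packages/a": {"name": "a", "version": "1.0.0"},
        "node_modules/a": {"resolved": "packages/a", "link": True},
        "node_modules/b": {
            "version": "2.0.0",
            "resolved": "https://registry.npmjs.org/b/-/b-2.0.0.tgz",
            "integrity": "sha512-PLACEHOLDER",
        },
        "node_modules/@example/odd-tarball": {
            "version": "0.1.0",
            "resolved": "https://registry.npmjs.org/@example/odd-tarball/-/odd-tarball-0.1.0.tgz",
            "integrity": "sha512-PLACEHOLDER",
        },
    },
}

if __name__ == "__main__":
    for path, item in plan_tree(EXAMPLE_LOCK).items():
        print(path, item)
```

Because the keys are already the final install paths, nothing has to be resolved or hoisted at this point; the plan is a straight walk over the map, which is roughly why this format is so pleasant to work with.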
I still need to run the lifecycle scripts, but I understand exactly which ones get run for various resolutions; I imagine this will be done in the next week or two. So far, I never invoke NPM or Yarn. For now I use pacote as a stopgap for those obnoxious tarballs, and I have a utility that uses it to fetch and transform packuments/manifests - but this could run as a full replacement in the future.
Notable highlights:
Individual modules are cached and composed separately - this is not the approach taken by most tools and it made an enormous impact on performance.
Intermediate phases are all cached individually as well, so changing node versions or various inputs will not trigger a full rebuild of the closure.
Memory consumption is less than half of either Yarn or NPM in equivalent phases. I haven’t done proper benchmarks yet, but I never exceeded 800 MB on --ignore-scripts installs that consume over 20 GB with Yarn and 16 GB with NPM. CPU usage never exceeded 10%. With an empty cache and built-in fetchers I clocked 12 minutes where NPM took 20 and Yarn took 50 (honestly fuck Yarn, this time is largely spent collecting metrics and unzipping/rezipping tarballs…). I shuffled and renamed the output directories several times to ensure nothing got rebuilt - and I produced new node_modules/ trees in under 3 seconds.
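For anyone wondering why per-module and per-phase caching matters so much, here is a toy Python model of the idea. It is only an illustration, not how Nix or my tooling implements it: `PhaseCache`, `unpack`, `build` and `compose` are hypothetical names, and the "outputs" are just strings. The point is that each phase is keyed only on the inputs it actually depends on, so renaming output paths costs nothing and changing the Node version redoes only the phase that depends on it.

```python
import hashlib


class PhaseCache:
    """Toy store: results are keyed by (phase name, inputs) and never recomputed."""

    def __init__(self):
        self.store = {}
        self.runs = []  # log of phases that actually had to execute

    def run(self, phase, inputs, action):
        key = hashlib.sha256(repr((phase, inputs)).encode()).hexdigest()
        if key not in self.store:
            self.runs.append(phase)
            self.store[key] = action()
        return self.store[key]


def unpack(cache, integrity):
    # Depends only on the tarball's integrity hash.
    return cache.run("unpack", (integrity,), lambda: f"unpacked:{integrity}")


def build(cache, integrity, node_version):
    # Depends on the unpacked source and the Node version.
    src = unpack(cache, integrity)
    return cache.run(
        "build",
        (integrity, node_version),
        lambda: f"built:{src}@node{node_version}",
    )


def compose(cache, modules, node_version):
    """A tree is just install paths pointing at cached per-module outputs."""
    return {
        path: build(cache, integrity, node_version)
        for path, integrity in modules.items()
    }


if __name__ == "__main__":
    cache = PhaseCache()
    modules = {"node_modules/a": "sha512-aaa", "node_modules/b": "sha512-bbb"}

    compose(cache, modules, "18")
    print(cache.runs)  # both phases ran once per module

    # Same modules under different output paths: nothing new runs.
    renamed = {"out/" + path: integ for path, integ in modules.items()}
    compose(cache, renamed, "18")
    print(len(cache.runs))

    # New Node version: only the build phase reruns, unpacking is reused.
    compose(cache, modules, "20")
    print(cache.runs.count("unpack"), cache.runs.count("build"))
```

Composing a tree in this model is a dictionary comprehension over cached results, which is the same reason the real thing can spit out a fresh node_modules/ tree so quickly once the store is warm.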
So there is still work to be done, but I’d say “things are looking good”.
Honestly, shout-out to Eelco and the Tweag folks for pushing the new UI stuff. I’ll admit relearning things drove me crazy at first, and the docs leave a lot to be desired right now; but the various new types of caches and fetchers can absolutely fly if you leverage them well (and can dedicate the time to reading the sources a lot). This UI still has kinks to work out and a long road ahead, but “I get it” now. I think in a while the various lang2nix tools could be, like, less shit, or rather potentially competitive, and accessible to folks who don’t have much experience with Nix.