Could pnpm2nix be derivation-per-package?

I’ve been using the most up-to-date version of pnpm2nix, which is FliegendeWurst/pnpm2nix-nzbr on GitHub (“Build packages using pnpm with nix”), but it seems to result in huge Docker images that contain many packages that are not necessary at runtime. As far as I can tell, it just adds every package named in the lockfile and includes them all in a single Nix derivation. It even includes all the tarballs for all the packages as separate Nix derivations.

I think a better design would have a Nix derivation per package. We rely on pnpm to resolve all the dependencies and write the dependency graph to the lockfile, and for each package the node_modules directory will include soft links for both direct and peer dependencies.

I’m pretty new to both Nix and the npm ecosystem and I’ve been relying on Claude Sonnet 4 to help me explore this idea and determine its feasibility, so I wanted to solicit feedback from people who know these areas better before launching into it.

The Problem with Current Approach

The existing pnpm2nix recreates pnpm’s entire .pnpm store structure in one large derivation. This means:

  • Massive Docker images: Everything in the lockfile gets included, even packages only used by dev dependencies or unused transitive dependencies
  • Poor caching: Changing any dependency rebuilds the entire dependency tree
  • No Nix integration: Can’t use overlays, package overrides, or other Nix tooling on individual packages

Proposed Approach: One Derivation Per Package

Instead of one big derivation, create individual derivations for each resolved package context:

  • react@19.1.0 → one derivation
  • some-ui-lib@1.0.0(react@19.1.0)(typescript@5.3.2) → different derivation (peer deps create separate contexts)
  • some-ui-lib@1.0.0(react@18.0.0)(typescript@5.3.2) → yet another derivation

Each derivation would contain the package’s extracted files, and include symlinks to both direct and peer dependencies.

Why This Might Work

Ecosystem resilience: The JavaScript ecosystem already works with diverse package managers (npm flattening, pnpm symlinks, Yarn PnP eliminating node_modules entirely). Packages that work with these should work with our symlink approach.

pnpm already solved the hard parts: The lockfile contains the resolved dependency graph. We don’t need to reimplement dependency resolution - just faithfully recreate what pnpm already figured out.

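Content-addressable alignment: pnpm’s store is content-addressable, and Nix can fetch each tarball as a fixed-output derivation pinned by its content hash (most other Nix store paths are input-addressed). The two systems are close enough that we’re translating between compatible models.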

Nix’s lazy evaluation: Only packages actually reachable from your application would get built and included, automatically excluding unused dependencies.
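To make the reachability idea concrete, here is a minimal Python sketch that walks an already-parsed lockfile from the runtime dependencies only. The lockfile shape is an assumption based on my reading of the v9 format (an "importers" section plus a "snapshots" section whose keys look like name@version with an optional peer suffix); the function names and the sample data are made up for illustration.

```python
from collections import deque


def snapshot_key(name, version):
    # Assumption: a snapshot key is "<name>@<version>", where the version
    # string may carry a peer suffix such as "3.10.1(pg@8.16.2)".
    return f"{name}@{version}"


def runtime_closure(lock, importer="."):
    """Return the snapshot keys reachable from the importer's runtime deps."""
    roots = lock["importers"][importer].get("dependencies", {})
    queue = deque(snapshot_key(n, d["version"]) for n, d in roots.items())
    seen = set()
    while queue:
        key = queue.popleft()
        if key in seen:
            continue  # already visited, so cycles terminate
        seen.add(key)
        snap = lock["snapshots"].get(key, {})
        for field in ("dependencies", "optionalDependencies"):
            for name, version in snap.get(field, {}).items():
                queue.append(snapshot_key(name, version))
    return seen


# Hypothetical, heavily trimmed lockfile content (as yaml2json might emit it).
lock = {
    "importers": {
        ".": {
            "dependencies": {"pg": {"version": "8.16.2"}},
            "devDependencies": {"typescript": {"version": "5.3.2"}},
        }
    },
    "snapshots": {
        "pg@8.16.2": {"dependencies": {"pg-pool": "3.10.1(pg@8.16.2)"}},
        "pg-pool@3.10.1(pg@8.16.2)": {"dependencies": {"pg": "8.16.2"}},
        "typescript@5.3.2": {},
    },
}

reachable = runtime_closure(lock)
```

In this sample, typescript@5.3.2 is only a dev dependency, so it never enters the closure. In the real design Nix would do this implicitly by only realising derivations that are referenced, so the sketch only shows which packages would end up referenced.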

Benefits

  • Smaller images: Only include packages actually needed
  • Better caching: Individual package changes only rebuild affected derivations
  • Nix integration: Full support for overlays, overrides, etc.
  • Fine-grained layers: Each package can become its own Docker layer with buildLayeredImage, up to its maxLayers limit

Implementation Approach

  1. Parse pnpm-lock.yaml (using yaml2json)
  2. Treat complex package identifiers like package@1.0.0(react@19.1.0) as opaque strings - we don’t need to understand what the parentheses mean, just use the whole string as a unique package identifier
  3. Create one derivation per unique package identifier
  4. Wire up dependencies based on lockfile relationships
  5. Build symlink structure for each package’s node_modules
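Steps 2 to 5 can be sketched in Python as a pure planning step that decides which symlinks each package output would need. Everything here is an assumption for illustration: the "snapshots" shape, the helper names, and the layout, which mirrors pnpm’s own (a package’s files live at <out>/node_modules/<name> and its dependencies are sibling symlinks). The store_path callback stands in for whatever the real Nix expression would provide.

```python
def plan_links(lock):
    """Map each snapshot key to {dependency name: snapshot key of the target}."""
    plan = {}
    for key, snap in lock["snapshots"].items():
        links = {}
        for field in ("dependencies", "optionalDependencies"):
            for name, version in snap.get(field, {}).items():
                # The identifier is treated as an opaque string: we only
                # concatenate name and version, never parse the parentheses.
                links[name] = f"{name}@{version}"
        plan[key] = links
    return plan


def render_links(plan, store_path):
    """Yield (link path, link target) pairs for every planned symlink.

    store_path is a hypothetical callback: snapshot key -> output directory.
    """
    for key, links in sorted(plan.items()):
        for name, target in sorted(links.items()):
            yield (
                f"{store_path(key)}/node_modules/{name}",
                f"{store_path(target)}/node_modules/{name}",
            )


# Hypothetical sample using the identifiers from the proposal above.
ui = "some-ui-lib@1.0.0(react@19.1.0)(typescript@5.3.2)"
lock = {
    "snapshots": {
        "react@19.1.0": {},
        "typescript@5.3.2": {},
        ui: {"dependencies": {"react": "19.1.0", "typescript": "5.3.2"}},
    }
}

plan = plan_links(lock)
# Placeholder paths such as "<react@19.1.0>" stand in for real store paths.
links = list(render_links(plan, lambda key: f"<{key}>"))
```

Each entry in plan would correspond to one derivation, and each pair from render_links to one ln -s in that derivation’s build step. The sketch ignores cycles; whether they can be handled at all is a separate question.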

Questions for the Community

Is this a good idea? Does this approach make sense from a Nix perspective?

What problems am I going to run into? I can think of a few potential issues:

  • Node.js module resolution behaving differently with symlinks vs the current approach
  • Workspace dependencies creating build ordering complexity
  • Native packages (.node files) having filesystem layout assumptions
  • Path length limits with deep Nix store symlink structures

Has this been tried before? Are there existing approaches I should look at?

Performance concerns? Would having many small derivations instead of one large one create build performance issues?

buildNpmPackage? As far as I can tell, buildNpmPackage doesn’t support pnpm, and discussion of how to do so has petered out (see “Add a `buildPnpmPackage` ? Or add pnpm build and install hooks?”, Issue #317927 in NixOS/nixpkgs on GitHub). Elsewhere it’s suggested this can work (see “How to use `buildNpmPackage` for a pnpm project?”). What’s the situation?

Testing strategy? What’s the best way to build automated tests for something like this - both unit and integration tests?

Thanks for any insights!

Hey, please have a look at the pnpm hooks. As for the packages: every package in the lockfile pulls in packages from other libraries that are required at build time, until your application has been built with Node. After the build, you’re free to delete the packages in your store.

But I feel the pnpm hook could be improved so that it reads the lockfile and takes packages from the Nix cache where they are already available, and only fetches from npm when a package is missing.

One benefit is storage space; performance would be slightly affected.

Yes, and I don’t recall why it didn’t go through. It’s probably a good idea though.

I had a look at that page, but AFAICT it’s the opposite of the granularity I want - any time I make any change, everything has to be rebuilt, and all the build-time dependencies as well as the run-time dependencies end up in a single layer in my Docker image unless I delete things by hand. With the solution I propose (combined with eg nix2container), Nix will automatically handle including only what is needed in the Docker image.

It turns out that peer dependencies can result in circular dependencies, which means this approach won’t work as-is. Currently looking at using Tarjan’s algorithm to identify and bundle together the circular dependencies.

For Yarn Berry, madjam002/yarnpnp2nix on GitHub (“A performance focused and space efficient way of packaging NodeJS applications with Nix”) does derivation per package.

Oh interesting, thanks! AFAICT this is possible because unlike pnpm, Yarn replaces Node’s module resolution algorithm, so it can add peer dependencies after the fact. Does that seem right? Thanks!

I implemented a pnpm builder myself with IFD, in Python, about 3 years ago. These were my findings.

I think pnpm nowadays could be very promising.

Some of my latest thoughts, where I left off:

We still need support for YAML lockfiles, or else IFD, which is not allowed in nixpkgs and other repositories. Nix has had an open PR (builtins.fromYAML) for years, but it doesn’t look like it’s going to be merged anytime soon.

Ideally we could provide a plugin or something similar for pnpm to load its dependencies from, which would serve pnpm the dependencies from the Nix store. What I did was to reproduce pnpm’s link structure, which is really complex.
They say it’s not so complex if you don’t have peer dependencies, but those are really used a lot.

Some rare dependencies still need the Yarn compatibility DB, which can be turned off, but that might result in a non-working package. This might be minor. They rely on dependencies that are essentially not locked in the lockfile.

If we manage to find a solution for A and B, that would allow pnpm to have a better story than npm in Nix.

I’d love to do this without IFDs! However, in addition to builtins.fromYAML, we’d need a way of building efficient functional data structures. In e.g. Clojure, you can create a new dictionary that differs from an old one by one key in effectively O(1) time. With something like this you can implement Tarjan’s algorithm in O(n) time. I see no efficient way to find SCCs in pure Nix, so I’m using an IFD to a Rust package which implements Tarjan over JSON inputs.

Why do we need to find SCCs again? With builtins.genericClosure you get a closure that contains each package exactly once.
Unless you need to rebuild some of the packages, that should be fine? What do you want to do with the SCCs? I guess grouping them together, but why exactly, could you remind me? (It’s been a while.)

Once pnpm has resolved peer dependencies, there are cycles in the dependency graph described by the “snapshots” key in pnpm-lock.yaml. But in Nix, we can’t have package A depend on B and vice versa; whichever is built earlier cannot depend on the one built later. If there were a way to add a dependency after the build, that would help, but Node’s resolution algorithm is to search upwards based on the real location of the file, so there’s no way to add a dependency after a package in the Nix store is built. So I am handling circular dependencies like “pg@8.16.2” ↔ “pg-pool@3.10.1(pg@8.16.2)” by building a single Nix package that contains both.
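For anyone following along, this is roughly what the grouping step looks like, written as a Python sketch rather than the Rust tool I actually use. It is a plain recursive Tarjan over an adjacency map; the "app" root and the "leaf@1.0.0" package are hypothetical and only there to show the output order.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. graph maps node -> iterable of successor nodes.

    Components come out dependencies-first (reverse topological order of
    the condensed graph), which is also a valid build order.
    """
    index = {}      # discovery number of each visited node
    lowlink = {}    # smallest index reachable from the node's subtree
    stack = []
    on_stack = set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # node is the root of a component: pop its members off the stack
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(sorted(component))

    for node in graph:
        if node not in index:
            visit(node)
    return components


graph = {
    "app": ["pg@8.16.2"],
    "pg@8.16.2": ["pg-pool@3.10.1(pg@8.16.2)", "leaf@1.0.0"],
    "pg-pool@3.10.1(pg@8.16.2)": ["pg@8.16.2"],
    "leaf@1.0.0": [],
}

components = strongly_connected_components(graph)
```

Every component with more than one member becomes a single Nix package, and singletons stay derivation-per-package. The recursion could hit Python’s recursion limit on a very deep graph, so an iterative variant would be safer for a real lockfile.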

“whichever is built earlier cannot depend on the one built later.”

The question is what “built earlier” means by your definition.
You typically don’t need to build a package like pg@8.16.2. A Nix “build” in this context typically refers to fetching a tarball from the npm registry. The tarball contains an already built package.
Fetching them all is possible by using builtins.genericClosure, which always gives you a graph without cycles.

So, in my humble understanding, resolving the cycles is not necessary. Once you have fetched all packages, all of them are typically already built. Maybe some need to run untrusted install scripts (devilish ones), but we can and should defer that until after we realize the actual node_modules folder.

If pg@8.16.2 requires a cyclic peer to be present at runtime, we can simply create a symlink in the directory (which is what pnpm does, right?) to satisfy that at runtime. Typically there is no real build time for a package, just a setup and install time for the root node of the graph (your top-level package), but at that point all dependencies exist and you can create arbitrary symlinks.

I’m afraid I don’t follow what you’re proposing. What’s the final on-disk layout of js files and node_modules soft links in the proposal you have in mind?

I want the js files for a package like pg@8.16.2 to live under /nix/store. Those files contain references to pg-pool dependencies. When Node tries to resolve those dependencies, it will search upwards from the location of the js file itself for a node_modules directory containing the reference. Because the files live in /nix/store, and because Node resolves based on the actual location after following all soft links, that means the node_modules directory must also live inside the same package in /nix/store and contain a soft link that resolves pg-pool@3.10.1(pg@8.16.2). For most such dependencies, that link can be to a different package in the Nix store. However, in this case, the package containing pg@8.16.2 must soft-link to the package containing pg-pool@3.10.1(pg@8.16.2), and the package containing pg-pool@3.10.1(pg@8.16.2) must soft-link to the package containing pg@8.16.2. In Nix, this is only possible if they are the same package.
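To illustrate the lookup described above, here is a small Python simulation of the directory search. It is a simplified model based on my understanding of Node’s default behaviour, covering only the upward node_modules walk from the real path and not package.json handling or file extensions. The directory names (store, pg-out, pool-out, app) are hypothetical stand-ins for Nix store paths and a project directory.

```python
import os
import tempfile


def resolve_package(from_file, name):
    """Return the real directory found for `name`, or None if none is found."""
    # Start from the real location of the requiring file, after symlinks.
    current = os.path.dirname(os.path.realpath(from_file))
    while True:
        # A directory that is itself called node_modules is skipped.
        if os.path.basename(current) != "node_modules":
            candidate = os.path.join(current, "node_modules", name)
            if os.path.isdir(candidate):
                return os.path.realpath(candidate)
        parent = os.path.dirname(current)
        if parent == current:
            return None  # reached the filesystem root
        current = parent


def demo():
    root = os.path.realpath(tempfile.mkdtemp())
    pg = os.path.join(root, "store", "pg-out", "node_modules", "pg")
    pool = os.path.join(root, "store", "pool-out", "node_modules", "pg-pool")
    app_modules = os.path.join(root, "app", "node_modules")
    for directory in (pg, pool, app_modules):
        os.makedirs(directory)
    open(os.path.join(pg, "index.js"), "w").close()

    # The application links to both packages, as a top-level install would.
    os.symlink(pg, os.path.join(app_modules, "pg"))
    os.symlink(pool, os.path.join(app_modules, "pg-pool"))
    entry = os.path.join(app_modules, "pg", "index.js")

    # The link in app/node_modules does not help: the search starts in "store".
    before = resolve_package(entry, "pg-pool")

    # Only a link inside pg's own output makes pg-pool visible to pg.
    os.symlink(pool, os.path.join(os.path.dirname(pg), "pg-pool"))
    after = resolve_package(entry, "pg-pool")
    return before, after, pool


before, after, pool = demo()
```

In this model, before is None and after is the pg-pool directory. That is the constraint I keep running into: the link has to exist inside pg’s own output, and pg-pool’s output needs the mirror-image link back to pg.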

So, if we generally want to do derivation-per-package, but we want to cope with such cycles, we have to find the SCCs. The only alternative I see is to modify Node’s resolution algorithm, but that seems nontrivial.
