Crate2nix: Setting `codegenUnits` for all crates

I’m using crate2nix to build a Rust project with a large number of crate dependencies, which of course takes a long time to build.

I would like to set the codegenUnits value passed to buildRustCrate to the rustc/Cargo default of 16 for every crate, instead of buildRustCrate’s default of 1. I could set this for each individual crate, but that would be unmaintainable. Is it possible to override the default for buildRustCrate as called by crate2nix?
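For context, here is a sketch of one way this might be done (untested, and the details may vary by crate2nix version): the generated Cargo.nix accepts a buildRustCrateForPkgs argument that lets you substitute the build function, so you can wrap buildRustCrate and inject codegenUnits for every crate at once.

```nix
# Sketch: wrap buildRustCrate so every crate generated by crate2nix is
# built with 16 codegen units instead of the default of 1.
# Assumes a crate2nix-generated ./Cargo.nix that accepts buildRustCrateForPkgs.
let
  cargoNix = import ./Cargo.nix {
    inherit pkgs;
    buildRustCrateForPkgs = pkgs:
      crateArgs: pkgs.buildRustCrate (crateArgs // { codegenUnits = 16; });
  };
in
cargoNix.rootCrate.build
```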

I opened a PR last year to fix this. It got a negative review that I can only summarize as “I don’t think this will speed up builds; please do some benchmarking to prove me wrong”, and arguing about it was not a high priority for me. It’s pretty obvious to anybody who uses buildRustCrate regularly that this makes builds much faster.

So I just carry this patch in my local tree instead. It still works great. Even better with -Zthreads=$NIX_BUILD_CORES :wink:
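To illustrate how those two settings might be combined (a sketch under assumptions: I’m assuming buildRustCrate forwards an extraRustcOpts list verbatim to rustc, and -Zthreads is a nightly-only flag):

```nix
# Sketch (untested): combine the codegen-units override with rustc's
# parallel front end. extraRustcOpts is assumed to be passed straight
# through to rustc by buildRustCrate.
crateArgs: pkgs.buildRustCrate (crateArgs // {
  codegenUnits = 16;
  extraRustcOpts = [ "-Zthreads=16" ];  # nightly-only flag
})
```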

Also, @katexochen you have the wrong link in this comment. The correct link is:

https://github.com/NixOS/nixpkgs/issues/50105#issuecomment-1909437434

I still don’t miss having a GitHub account!

From what I can see the comment says:

Looking at buildRustCrate: nondeterministic across different values of NIX_BUILD_CORES · Issue #130309 · NixOS/nixpkgs · GitHub I can’t see evidence of benchmarks being performed, only two votes in favour of disabling parallelism and none against.

I’m willing to raise a PR to increase codegen-units to 16 if I can actually find evidence of it improving compile times while not making a significant difference to runtime performance and output size.

Ideally we could override this in crate2nix since doing it for an individual invocation of buildRustCrate is easy enough, not so easy for hundreds of invocations.

I was looking at Fix rust nondeterminism by davidscherer · Pull Request #170981 · NixOS/nixpkgs · GitHub re: benchmarks, though they didn’t mention the runtime performance differences.

The simplest way to resolve that would be, of course, benchmarking.

In any case, changing the default isn’t unreasonable, but having numbers will be far more convincing to potential reviewers.

I’m working on it, but it would be good to know if we have a standardised benchmark for this kind of thing.


I’ll have to look more into why, but increasing Nix’s cores setting and increasing the cores allocated to the VM reduced the build times for COSMIC Desktop by a few hours. That was using buildRustPackage rather than buildRustCrate, but it does at least imply a large benefit from increasing codegen units (assuming that’s what the cores setting affects with buildRustPackage).

I finally got around to attempting a benchmark using the fhir-sdk crate (a crate I’ve been working with that consistently takes up the majority of the build time).

Unless I’ve somehow set codegen-units wrong (all I’m doing is setting the RUSTFLAGS environment variable to -Ccodegen-units=<n>), increasing codegen-units beyond 1 actually makes the build slower.
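For reference, a minimal sketch of how such a benchmark environment might set the flag (the shell setup here is illustrative; only RUSTFLAGS itself is the standard variable Cargo forwards to rustc):

```nix
# Sketch: a dev shell where cargo builds pick up the codegen-units override.
pkgs.mkShell {
  # Cargo passes RUSTFLAGS to every rustc invocation it spawns.
  RUSTFLAGS = "-Ccodegen-units=16";
}
```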

All testing was performed on an AWS Graviton4 CPU with 16 cores allocated with Rust 1.81.0. Results are as follows:

  • 1 codegen unit: 34m 3s
  • 10 codegen units: 39m 6s
  • 16 codegen units: 40m 3s

Even setting -Zthreads=16 on a Nightly compiler had a negative effect.

More details are viewable at feat: nix flake for compiler benchmarking · drakon64/fhir-sdk@ac79c86 · GitHub.

Are those times wall-clock times, or were you using hyperfine, time, or a similar command?
If the latter, it’d be interesting to see the full results.


time; you can see the flake at fhir-sdk/flake.nix at ac79c86b12fc8101099507c14611521bf2cdfed0 · drakon64/fhir-sdk · GitHub

Sorry, I fixed the link in the comment.