Flakes are amazing, but having to either pull n copies of nixpkgs over the network, or meticulously manage flake inputs to add the proper follows in the proper places, is a suboptimal solution, especially when we actually want different artifacts from different nixpkgs versions. Given the growing size of nixpkgs, I figured solving this problem would save a lot of bandwidth and time, and then it hit me: we could just use git itself.
Having a single full copy of the git repository locally, in a secured location accessible by the Nix daemon but not by unprivileged users, could resolve or substantially mitigate this issue. A simple call to git could check out the given commit or ref and copy that tree into the Nix store. If a requested reference is not already present, git could do a simple fetch to pull in new changes.
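To make the idea concrete, here is a minimal sketch of that flow, assuming only git. A throwaway repository stands in for the real local nixpkgs mirror, and all paths and names are illustrative:

```shell
# Throwaway repo standing in for the local nixpkgs mirror.
set -eu
repo=$(mktemp -d)
git -C "$repo" -c init.defaultBranch=main init -q
printf '{ }\n' > "$repo/default.nix"
git -C "$repo" add default.nix
git -C "$repo" -c user.name=demo -c user.email=demo@example.com commit -qm init

# 1. Resolve the requested ref to a commit; on a shared mirror, a
#    "git fetch" here would first pull in any missing upstream changes.
rev=$(git -C "$repo" rev-parse HEAD)

# 2. Export that commit as a plain tree, which the daemon could then
#    copy into the Nix store.
out=$(mktemp -d)
git -C "$repo" archive "$rev" | tar -x -C "$out"
```

The nice property is that steps 1 and 2 are read-only with respect to the mirror, so the daemon never needs to mutate a working tree on behalf of users.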
Of course this adds a dependency on git, so it would have to be a completely optional and/or unofficial solution; still, given the number of flakes in the wild that are bound to refer to nixpkgs, it might be worth the effort.
Just wanted to collect your thoughts, opinions, possible pitfalls or purity concerns.
I went ahead and filed an issue, to see how upstream feels about it.
Indeed, I’m also interested in a declarative solution for NixOS systems. This may also be possible using only the existing git fetcher and a local git checkout, perhaps with a timer to fetch the repo and merge in branch changes on a regular basis.
I think we could implement this already without any changes to Nix or the fetchers. We would just need to point api.github.com at some localhost address via /etc/hosts and then re-implement the GitHub tarball API to build tarballs from a local git checkout.
A problem with this could be finding the exact tar parameters that reproduce the same sha256 GitHub produces.
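For what it’s worth, git archive can at least reproduce GitHub’s top-level "owner-repo-shortrev/" directory via --prefix; whether the compressed bytes (and hence the sha256) come out identical is exactly the open question above. A rough illustration against a throwaway repo, with NixOS/nixpkgs used purely as example names:

```shell
# Throwaway repo; the --prefix mimics GitHub's tarball directory layout.
set -eu
repo=$(mktemp -d)
git -C "$repo" -c init.defaultBranch=main init -q
echo hi > "$repo/README.md"
git -C "$repo" add README.md
git -C "$repo" -c user.name=demo -c user.email=demo@example.com commit -qm init

short=$(git -C "$repo" rev-parse --short=7 HEAD)
tarball=$(mktemp)
# No extension on the output path, so git archive emits a plain tar.
git -C "$repo" archive --prefix="NixOS-nixpkgs-$short/" HEAD -o "$tarball"
tar -tf "$tarball"
```

Matching GitHub’s gzip settings (timestamps, compression level) on top of this would still need experimentation.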
Have you ever used builtins.fetchGit or fetchFromGitHub on nixpkgs? It’s amazingly slow, because the repo is, as you say, huge, and git does not scale all that well. For moderately sized n, pulling n full archives will, in my experience, still be as fast as or faster than doing n checkouts, even from a fully local repository. (Assuming a > 100Mbit/s connection and SSD storage.)
Which is a shame… I always found specifying branches, tags and commits much more convenient to both write and read. But once you start rebuilding and upgrading often enough, like more than once a day, it’s just too slow to be practical.
I may not have noticed this because I’ve been using NVMe drives exclusively for a few years, and referencing a local nixpkgs seems a lot faster on my machine than the GitHub remote. That said, I’d be very interested in running some benchmarks to find out for sure across different setups.
@DavHau, that seems interesting as well. Do we even have to compress the archive? I just tried flake refs of the form git+file:///path/to/nixpkgs?ref=release-20.09 and git+file:///path/to/nixpkgs?ref=master inside a flake, and they seemed to work beautifully. So it looks like most of the work is already done.
All that’s left is perhaps to urge upstream to store a copy of nixpkgs locally for just this reason, and to wire up the flake registry to look there first, either by default or at least as an option. We would also need some logic to update the local mirror when a reference is requested that isn’t already stored there.