Flakes Blocker for Org Users: Github ratelimit and need for flake mirroring

pwaller · December 30, 2024, 12:03pm

Org user(s) blocker

Users within large orgs, first using nix flakes, encounter a github rate limit (owing to a large number of users sharing an IP). The obvious workarounds are a poor fit and hinder adoption.

I’m interested in making nix more widely usable in my organization, but the first thing users hit is that when nix run nixpkgs#something reaches out to the github API to query the latest commit, it hits a rate limit. Users are using shared machines where they are reluctant to put security tokens, so setting a token is not a great workaround. This occurs in all sorts of scenarios with flakes, for example running nix flake update; a registry pin is insufficient. What would be ideal is a way to configure nix to use an internal mirror instead (e.g. to be able to insert a redirect for github:<foo> to gitiles:<url>/<foo>) that nix respects in all cases, similar to git’s “insteadOf” (Git - git-config Documentation).

The obvious workarounds I’ve tried result in creating full clones of nixpkgs, which is not great.

picnoir · December 30, 2024, 1:47pm

I moved this post to a new thread: the community team updates thread is meant to be used by the NixOS teams to post their status update.

waffle8946 · December 30, 2024, 5:19pm

FWIW a blobless (NOT treeless) clone is massively quicker than a full clone:

git clone --filter=blob:none https://github.com/NixOS/nixpkgs.git

pwaller · December 30, 2024, 5:20pm

Thanks for splitting it off as you deem appropriate.

I felt it was on topic owing to the text:

These are the topics where we need more information, but your scope is not limited to these.

Organizational/Commercial Nix Users

If you use Nix/NixOS as critical infrastructure and have feedback/topics of note.

But I’m unclear exactly: is there a team for this? If that thread is not the place to find people to coordinate with, where is? This problem is an immediate hindrance with no obvious way to proceed (and I guess many will be in the same boat and simply give up), so I just wanted to raise my hand and show there is an issue here.

pwaller · December 30, 2024, 5:23pm

That’s fine and all, but how can one make commands from the nixpkgs manual/community work out of the box, such as nix flake update on a flake? Or just nix run nixpkgs#hello? This is the hindrance to adoption.

You can of course configure a flake registry entry nixpkgs => git://internal fork of nixpkgs, but this would result in very poor performance since it results in a full git clone internally to nix. I’ve seen this take upwards of 30 minutes. It also doesn’t solve the nix flake update scenario since many flakes specify github:nixos/nixpkgs as the flake URL and therefore will expect to fetch from github in any case (as well as having github encoded in the lockfile).
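To see which inputs of a given flake would still be fetched from GitHub regardless of the registry, one can inspect the lock file directly. The following is a minimal sketch, assuming the usual flake.lock JSON layout (a top-level "nodes" object whose entries carry a "locked" attribute set with "type", "owner", "repo" and "rev"). The function name and the example lock contents are made up for illustration.

```python
import json


def github_locked_inputs(lock_text):
    """Return (node_name, "owner/repo", rev) for every input locked to GitHub."""
    lock = json.loads(lock_text)
    found = []
    for name, node in sorted(lock.get("nodes", {}).items()):
        # The root node has no "locked" entry, so default to an empty dict.
        locked = node.get("locked", {})
        if locked.get("type") == "github":
            repo = f'{locked.get("owner")}/{locked.get("repo")}'
            found.append((name, repo, locked.get("rev")))
    return found


# Made-up lock file contents, for illustration only (the rev is a placeholder).
EXAMPLE_LOCK = json.dumps({
    "nodes": {
        "nixpkgs": {
            "locked": {
                "type": "github",
                "owner": "NixOS",
                "repo": "nixpkgs",
                "rev": "0" * 40,
            },
            "original": {"type": "github", "owner": "NixOS", "repo": "nixpkgs"},
        },
        "root": {"inputs": {"nixpkgs": "nixpkgs"}},
    },
    "root": "root",
    "version": 7,
})
```

Calling github_locked_inputs on the contents of a real flake.lock lists every input that has github encoded in the lock, which is exactly the set a registry entry alone cannot redirect.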

nix run nixpkgs#hello is (somewhat) efficient because it uses the github API to query the latest commit and then an archive endpoint to fetch a tree for that commit. So far as I can see this is only supported by github and gitlab, which provide those. I am able to get an internal mirror on a gitiles for example, but A) gitiles isn’t supported by nix yet (it would be good if it were) and B) there is currently no way to transparently redirect requests from github:<url> to gitiles:internal-mirror/<url>, which would be the needed functionality to avoid being rate limited by external providers such as github.
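For reference, the two-step fetch described above maps onto two endpoints of the GitHub REST API: one that resolves a branch or tag to a commit, and one that serves a tarball of a commit. The sketch below only builds the URLs and makes no network requests. Whether nix uses exactly these URLs is an assumption here, and the helper names are hypothetical. Unauthenticated requests to api.github.com are limited per source IP, which is why many users behind one address exhaust the budget quickly.

```python
from urllib.parse import quote

# Base URL of the public GitHub REST API.
API = "https://api.github.com"


def commit_lookup_url(owner, repo, ref):
    """URL that resolves a branch or tag name to a commit (GET /repos/.../commits/{ref})."""
    return f"{API}/repos/{quote(owner)}/{quote(repo)}/commits/{quote(ref)}"


def tarball_url(owner, repo, rev):
    """URL that serves a tarball of the tree at a commit (GET /repos/.../tarball/{ref})."""
    return f"{API}/repos/{quote(owner)}/{quote(repo)}/tarball/{quote(rev)}"
```

A mirror that wants to stand in for github transparently would need to answer both kinds of request: the ref lookup and the archive download.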

waffle8946 · December 30, 2024, 5:26pm

True, I don’t know of a good solution to that, honestly. The whole “copy all the code for purity” behaviour is what causes poor perf with flakes.

Atemu · December 30, 2024, 6:14pm

I’m a bit confused because flakes are not stable by a long shot and aren’t documented as anything but an experimental feature. What makes you think they’re production-ready?

None of the legacy commands make any connection to GH APIs because they rely on the local NIX_PATH that is always updated explicitly using nix-channel or any other way to deploy a nixpkgs checkout onto a machine in a certain path.

If you really do want to make this work with flakes, in order to modify what the nixpkgs part in nixpkgs#foo points to, you need to modify the flake registry.

If you want to use flakes to ease dependency on external Nix projects however, you’re very quickly going to run into issues again because many of them will be hosted on Github; there’s no way around that.

If you as a company have trouble running into GH rate limits, you should look into e.g. setting up a proxy for GH at your company that injects an API key. Perhaps GH has docs on how to deal with rate limits as an org. It’s not really a Nix issue tbh.

emily · December 30, 2024, 6:48pm

You can make GitHub tokens with no special read or write permissions. For organizations it may make sense to have a stub account to generate these zero‐permission‐grant tokens to deploy in the Nix configuration to end users.

The UX of the GitHub API requirement sucks for regular end users too though and I wish it wasn’t a thing.

waffle8946 · December 30, 2024, 7:27pm

That’s technically true, but there’s no evidence that we are moving hosting platforms as part of flakes stabilisation. I will point out that currently the only alternative (that I know of) to GH from nixpkgs directly is using nixpkgs from the closed-source flakehub as a flake input.

Though if your org is prepared to accept that risk, @pwaller you may be able to go that flakehub route.

pwaller · December 30, 2024, 7:27pm

I’m a bit confused because flakes are not stable by a long shot and aren’t documented as anything but an experimental feature. What makes you think they’re production-ready?

I would like to avoid this becoming a debate about the readiness of flakes, and I think it is irrelevant to the point I am trying to make, which is forward-looking. In my mind this would be an important prerequisite to production readiness so I would love to see some kind of indication that this problem is understood and possibly even on a roadmap or something.

That said, users will try those commands today either way and find they are broken, and think or say that nix is broken, and potentially give up, or at least encounter substantial friction.

If you as a company have trouble running into GH rate limits,

We have mirrors of github repositories which are centrally managed. It’s just that I can’t point nix flakes at them, because nix flakes don’t currently have a reasonable way to express that I need to use a mirror.

e.g. setting up a proxy for GH at your company that injects an API key.

Maybe, but that’s not a solution I can personally make happen, and I suspect it would run into individual API key limits as well. It would also introduce problems of ownership/management of the API key.

–

All that is to say, the company has solved these problems internally for git. Just not in a way that I can currently put to use straightforwardly with nix.

pwaller · December 30, 2024, 7:34pm

Existence of flakehub or not (or any other mirror) there is still the problem, again, of supporting standard invocations like nix run nixpkgs#foobar and nix flake update where the flake points to github:nixos/nixpkgs?ref=. If there was some way to override the (non-ref) part of the flake reference, which affected lock files, then flakehub would become relevant in that at least it would be possible to redirect those requests there. But this would still need some functionality in nix, unless I’m mistaken. That said, introducing more external dependencies which might hit rate limiting without authentication, or are not in control of the org, is again a deal breaker for what I’m thinking about. Ideally these things come from the internally managed mirror.

waffle8946 · December 30, 2024, 7:41pm

Naturally, you would have to replace that with the appropriate FH url (I don’t use FH so I don’t know the format offhand).

But yes, I understand the concern about FH having its own rate limits… the cynic in me has some thoughts on that. Either way, like I said, it doesn’t seem like there is any movement to address this oft-discussed problem.

tomberek · December 30, 2024, 8:19pm

Not sure what the best approach is. There are some plumbing commands and approaches, but not clear what the best porcelain is here.

Some options:

nix flake metadata 'git+ssh://git@github.com/NixOS/nixpkgs?shallow=1' which can point to internal repos
nix flake metadata 'github:NixOS/nixpkgs?host=github.com' which can point to other hosts if they support a similar API
alias to nix flake update --override-input nixpkgs something-else
alias to nix flake update --inputs-from https://internal.org/registry.json
convert flake.lock flake.internal.lock && nix flake update --reference-lock-file ./flake.internal.lock
A single global rewrite of those hosts could be done, very similar to having an alternative /etc/hosts?

What would an ideal solution here be? A config option:

host-redirects = github.com=git.internal.org another.server.com=server.internal.org

or

url-redirects = github:=git+ssh://git@internal.github.com/

I’m sure that would run into a few usability issues regarding client vs server settings… if these rewrites applied or not to fetches that occur in the sandbox vs builtins, and so forth. And the associated auth issues.
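To make the host-redirects idea concrete, here is a minimal sketch of how such a setting could be parsed and applied to a URL. The setting does not exist in nix; its name and its "source=target" syntax are taken from the proposal above, and both function names are hypothetical. The sketch deliberately ignores the open questions just mentioned (sandbox vs builtins, auth).

```python
from urllib.parse import urlsplit, urlunsplit


def parse_host_redirects(value):
    """Parse 'a.com=b.org c.com=d.org' into a {source_host: target_host} dict."""
    redirects = {}
    for pair in value.split():
        source, sep, target = pair.partition("=")
        if not sep or not source or not target:
            raise ValueError(f"malformed redirect entry: {pair!r}")
        redirects[source.lower()] = target
    return redirects


def redirect_host(url, redirects):
    """Swap the host of url if it has a redirect; keep user, port, path and query."""
    parts = urlsplit(url)
    host = parts.hostname  # already lower-cased by urlsplit
    if host is None or host not in redirects:
        return url
    netloc = redirects[host]
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    if parts.username:
        netloc = f"{parts.username}@{netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

For example, with the value from the proposal, https://github.com/NixOS/nixpkgs would be fetched from git.internal.org while any host without an entry is left untouched.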

There seem to be two issues:

1. nix run nixpkgs#... can be addressed with a company-wide registry setting.
2. nix flake update might need both a registry setting and a small wrapper injecting an override-input or inputs-from.

A seemingly relevant issue, stale, but might be good to add suggestions/ideas: mirror:// support · Issue #6145 · NixOS/nix · GitHub

emily · December 30, 2024, 8:33pm

Hydra distributes Nixpkgs tarballs, and ~always has. You can use e.g. https://channels.nixos.org/nixos-unstable/nixexprs.tar.xz with flakes.

(I am not saying this solves the overall problem in this thread, just addressing this narrow claim.)

Atemu · December 30, 2024, 8:37pm

While that is technically true, there isn’t really a good way to lock it using flakes, so you’ll have nix dependencies move away under you and suffer slow eval times because 40-50MiB need to be copied from the internet to the nix store all the time.

pwaller · December 30, 2024, 8:44pm

This, I think, hits the crux of what I’m saying. If flakes are to work in this scenario, it is a requirement that the user doesn’t have to edit the flake.nix or flake.lock file. A user needs to be able to take an arbitrary flake from the internet and have it work. I wish to provide a configuration to nix where it works transparently, by redirecting problematic requests to internal mirrors.

(while holding the other properties discussed above, e.g. ideally not requiring any github API keys floating around anywhere, and ideally not requiring a proxy which MITMs github)

pwaller · December 30, 2024, 8:50pm

As in my most recent comment, my ideal solution looks something like what you suggested. A nix configuration option, which enables using internal mirrors for problematic hosts such as github, by redirecting those somewhere internal would be ideal.

I didn’t fully understand what you were saying about nix flake metadata; what does nix flake metadata <flake> do other than show the metadata for that flake?

--override-input completely overrides the input so wouldn’t allow overriding just the host part (or ‘type’ part, if necessary, for example).
converting existing flake files does not seem ideal; I want flakes to work inside the firewall just as they work outside the firewall, using the same commands and the same files. Having downstream forks seems distinctly non-ideal.
The global rewrite seems ideal. I would use something like the precedent from git: Git - git-config Documentation. As mentioned earlier, this enables an equivalent thing in git: I can git clone https://<github url> in a script, and if a global gitconfig exists with an insteadOf directive, the clone can be transparently redirected elsewhere.
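For readers unfamiliar with the git precedent: git’s url.<base>.insteadOf setting rewrites any URL that starts with the configured prefix so that it starts with <base> instead, and when several prefixes match, the longest one wins. The sketch below imitates that rule in isolation. It is not git’s implementation, and the mirror hostnames in the usage example are invented.

```python
def rewrite_url(url, instead_of):
    """Apply insteadOf-style rewriting.

    instead_of maps each prefix to be replaced onto its replacement base.
    If several prefixes match, the longest match is used.
    """
    best = None
    for prefix in instead_of:
        if url.startswith(prefix) and (best is None or len(prefix) > len(best)):
            best = prefix
    if best is None:
        return url  # no rule applies; the URL is used as written
    return instead_of[best] + url[len(best):]


# Invented mirror locations, for illustration only.
RULES = {
    "https://github.com/": "https://mirror.internal.example/github/",
    "https://github.com/NixOS/": "https://mirror.internal.example/nixos/",
}
```

The important property for this discussion is that the rewrite happens at fetch time only: the URL written in scripts (or, by analogy, in flake.nix and flake.lock) stays the same inside and outside the firewall.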

pwaller · December 30, 2024, 8:52pm

I agree with this broadly, and (2) is definitely the harder and most important part. (Although users are more likely to notice (1) first). I don’t see how to make it work with override-input or inputs-from though, since that would result in a different lockfile if you’re “inside the firewall” vs “outside the firewall” and it’s important that they are the same. If github appears in the flake/lockfile without the mirror, it should appear exactly the same with the mirror, since the mirror is more like an implementation detail of the network.

arijoon · December 31, 2024, 3:48pm

One of the problems I have with the current flakes cli is that nixpkgs is not pinned by default. One might argue this is more convenient, but I’m unsure how well advertised pinning it is:

> cat ~/.config/nix/registry.json
{
  "flakes": [
    {
      "exact": true,
      "from": {
        "id": "nixpkgs",
        "type": "indirect"
      },
      "to": {
        "lastModified": 1717179513,
        "narHash": "sha256-vboIEwIQojofItm2xGCdZCzW96U85l9nDW3ifMuAIdM=",
        "path": "/nix/store/wzx1ba5hqqfa23vfrvqmfmkpj25p37mr-source",
        "rev": "63dacb46bf939521bdc93981b4cbb7ecb58427a0",
        "type": "path"
      }
    }
  ],
  "version": 2
}

The above registry entry locks my nixpkgs reference when using flakes, so when I do nixpkgs#hello it won’t try to download nixpkgs again (it’s downloaded once). This is somewhat similar to channels. I use home-manager to generate that registry, but it can be done manually too. On shared machines in particular, a tarball can be downloaded from an internal source and added to the store with a simple script.
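As a rough illustration of generating such a registry by hand, the sketch below builds a version 2 registry with a single pinned entry pointing at a store path. It assumes the source has already been added to the store and that a "to" entry with only "path" and "type" is sufficient; the extra lastModified/narHash/rev fields shown above are left out. The function names are hypothetical.

```python
import json


def pinned_registry(flake_id, store_path):
    """Build a registry that pins flake_id to a local store path."""
    return {
        "flakes": [
            {
                "exact": True,
                "from": {"id": flake_id, "type": "indirect"},
                "to": {"path": store_path, "type": "path"},
            }
        ],
        "version": 2,
    }


def render_registry(flake_id, store_path):
    """Serialise the registry as JSON text, ready to be written to registry.json."""
    return json.dumps(pinned_registry(flake_id, store_path), indent=2, sort_keys=True) + "\n"
```

The output of render_registry("nixpkgs", "<store path>") could then be written to ~/.config/nix/registry.json per user, or to a system-wide registry file by whoever administers the shared machines.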

For more versatile ways of setting up mirrors tomberek’s post covers it all

waffle8946 · January 1, 2025, 2:10am

Not exactly true, on NixOS it is pinned to the nixpkgs revision used to build your system’s config. But nix on foreign distros, yes, the registry entry is unpinned, as it’s not really obvious what it should be pinned to anyway, unless you use system-manager. (And at that point using system-manager and/or HM to manage that registry entry makes more sense.)