Although we’re not at the hundreds-of-gigabytes scale yet, the potential for free size reductions is certainly interesting. I also find the point about big directories leading to exceptionally big tree objects interesting. I wonder how our directories are doing, or will be doing, in this regard. RFC0140 estimates that we’d end up with one huge directory (>1000 entries) and a few big ones (200-300 entries). Currently, the directories under pkgs/by-name are still quite small since not many packages have been migrated.
I tested this on nixpkgs. A clean clone from GitHub comes out at ~5GB for me at the moment.
Running the first option from the post (repacking with --window 250) reduces the size to 1.7GB. Compiling Microsoft’s git and repacking with the --path-walk option yields 1.9GB, which is slightly larger, so our repo is not an ideal use case for path walking.
Still, large-window repacking more than halves (!) the repo size. Considering that cloning nixpkgs in e.g. Egypt currently costs the equivalent of several USD, it might be worth trying to figure out if we can get GitHub to run this sort of repacking on nixpkgs.
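
For anyone who wants to reproduce the large-window number on their own clone, here is a rough sketch. The helper name `repack_with_window` is made up for illustration, and the flag combination (`-a -d -f`) is my assumption about a reasonable full repack, not necessarily the exact command from the post; only the window size of 250 comes from the test above.

```shell
#!/bin/sh
# Sketch: fully repack a local clone with a large delta window and
# print the pack size before and after.
# repack_with_window is a hypothetical helper, not part of git.
repack_with_window() {
    repo="$1"
    window="${2:-250}"   # 250 is the window size used in the test above

    echo "before:"
    git -C "$repo" count-objects -vH | grep '^size-pack'

    # -a: put all reachable objects into a single pack
    # -d: delete packs made redundant by the new one
    # -f: recompute deltas instead of reusing existing ones
    # --window: how many candidate objects to consider per delta
    git -C "$repo" repack -a -d -f --window="$window"

    echo "after:"
    git -C "$repo" count-objects -vH | grep '^size-pack'
}
```

Calling it as `repack_with_window path/to/nixpkgs 250` should print the two `size-pack` lines. Expect it to take a while and use a lot of memory on a repo of this size, since larger windows make delta search more expensive.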