Nix monorepo size and contribution

I want to add the following text somewhere. Where would be a good place? (After cleanup, of course.)

Quick guide to git workflow specific to nixpkgs (as it relates to its humongous size)

This is a big repo, and some contributors may not be comfortable downloading the entire repository with its history, which is a couple of GB. Git can instead download a “shallow” copy that contains only the current HEAD (currently only about 200 MB). A full treatment of the shallow-checkout commands is outside the scope of this document, but here is a quick overview of commands that may help you work with this repo. (As with Git and any other system, the right commands depend on your situation, so be wary of data loss if you don’t understand what a command does.)

# The following command checks out only the HEAD of this repository
git clone --depth=1

# Create a branch locally. After making and committing changes, push it to your GitHub fork under the same name:
git checkout -b myPackageEdit1
git push -f -u <REMOTE NAME> <BRANCH NAME>
# Make further changes.
# Extend the commit by staging the changes and running the following command, then push again.
git commit --amend --no-edit
git push -f -u <REMOTE NAME> <BRANCH NAME>


# To refresh the repository for the next PR, the following commands update it to contain only the latest HEAD commit.
# Two points to note:
# 1) This erases local changes, so beware.
# 2) If disk space itself is an issue, also look up `git gc` and `git repack`.
git checkout master
git fetch --depth 1
git reset --hard origin/master
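
For anyone who wants to try the clone-and-refresh steps before touching a real checkout, here is a minimal sketch on a throwaway repository. The `upstream` repository, the temporary paths, and the `demo` commit identity are all hypothetical stand-ins rather than anything from nixpkgs, and `rev-parse --is-shallow-repository` assumes a reasonably recent Git (2.15 or later).

```shell
#!/bin/sh
# Throwaway demo of a shallow clone and refresh. The "upstream" repository,
# all paths, and the commit identity are hypothetical stand-ins.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)
cd "$demo"

# Build a small stand-in upstream with three commits on master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
for i in 1 2 3; do
    echo "$i" > upstream/file
    git -C upstream add file
    git -C upstream commit -q -m "commit $i"
done

# --depth is ignored for plain local paths, hence the file:// URL.
git clone -q --depth=1 "file://$demo/upstream" shallow

# The shallow clone only knows about the tip commit.
shallow_count=$(git -C shallow rev-list --count HEAD)
is_shallow=$(git -C shallow rev-parse --is-shallow-repository)
echo "commits visible: $shallow_count, shallow: $is_shallow"

# Upstream moves on by one commit.
echo 4 > upstream/file
git -C upstream commit -q -am "commit 4"

# Refresh as described above; this discards local changes in the clone.
git -C shallow checkout -q master
git -C shallow fetch -q --depth 1
git -C shallow reset -q --hard origin/master

refreshed_count=$(git -C shallow rev-list --count HEAD)
refreshed_subject=$(git -C shallow log -1 --format=%s)
echo "after refresh: $refreshed_count commit(s), tip is '$refreshed_subject'"
```

The clone still sees a single commit after the refresh, which is the point of re-fetching with `--depth 1` rather than doing a plain `git pull`.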

# Now, do a bit of GitHub acrobatics.
# Bring your GitHub fork up to date with the upstream NixOS repository.
# Go to this link<your github user>/nixpkgs/compare/master...NixOS:master 

# Submit a pull request that brings your GitHub fork up to date.
# Find the commit ID of the latest commit in your local checkout.
# Browse to that commit in GitHub's interface and browse files.
# Create a new branch from that commit on the GitHub fork.
# Create a new local branch using `git checkout -b`, then work and commit as above.

The always-innovative Nix maintainers also plan to investigate the applicability of sparse cloning and checkouts later (help is welcome).

1) This erases local changes. So beware. (They say “Nothing in git is ever lost” so use reflog if you’ve made a mistake)

… and they lie (and also, backing up the reflog is complicated short of a full rsync). More to the point: any person who needs help doing a shallow checkout beyond «look up --depth for the clone and fetch subcommands» also needs some pointers on where to seek help for working with the reflog, because it is complicated.

If you discuss space concerns, should you mention repack/gc?

Also, maybe leave a remark that we plan to investigate the applicability of sparse cloning and checkouts later (maybe this will encourage someone to experiment and report the results?).

What single sentence should I write about repack/gc? I have never used either.

Maybe something like «If you are sure you do not want anything except the heads of currently defined branches, you might want to look at the ‘git gc’ subcommand.»

What I have never looked up is whether it is possible to reduce a checkout’s depth.
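
On the depth question: the `git fetch` documentation describes `--depth` as able to "deepen or shorten" the history of a shallow repository, so a quick experiment is possible. The following is only a sketch on a throwaway repository; the `upstream` repository, the paths, and the `demo` identity are hypothetical. It clones at depth 3, re-fetches at depth 1, then expires the reflog and runs `git gc --prune=now`, and finally reports whether an older commit object is still present.

```shell
#!/bin/sh
# Throwaway experiment: shorten an existing shallow clone, then reclaim space.
# The "upstream" repository, paths, and commit identity are hypothetical.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)
cd "$demo"

# Stand-in upstream with five commits on master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
for i in 1 2 3 4 5; do
    echo "$i" > upstream/file
    git -C upstream add file
    git -C upstream commit -q -m "commit $i"
done

# Start with a clone that is three commits deep (file:// so --depth applies).
git clone -q --depth=3 "file://$demo/upstream" clone
depth_before=$(git -C clone rev-list --count HEAD)
old_commit=$(git -C clone rev-parse HEAD~1)

# Ask for a smaller depth; this moves the shallow boundary to the tip.
git -C clone fetch -q --depth=1
depth_after=$(git -C clone rev-list --count HEAD)
echo "visible commits: $depth_before before, $depth_after after"

# The old objects stay on disk until they are pruned.
# Expiring the reflog also throws away the usual safety net, so beware.
git -C clone reflog expire --expire=now --all
git -C clone gc -q --prune=now

# Report whether the formerly visible parent commit is still stored.
if git -C clone cat-file -e "$old_commit" 2>/dev/null; then
    old_commit_present=yes
else
    old_commit_present=no
fi
echo "old commit still present: $old_commit_present"
```

Whether this is worth doing on a real nixpkgs checkout depends on how much history was there to begin with; the sketch only shows the mechanics.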

(I also wonder if the workflow described in this writeup ends up hitting some GitHub performance corner case; but as we have most users using some channels, it should not matter that much.)

Cool and where does this text go?

Timely post, for me. I’ll need to clone soon, and I’ve been intending to look into this (so I guess I’m in the target audience for the information).

On your primary where-it-goes question, I think tailored/concrete versions of this would go well in two areas:

  1. Contributing to Nixpkgs, particularly 17. Quick Start to Adding a Package. I think it would entail adjusting the existing chapter a little (perhaps even breaking it down into subchapters?) to make space for addressing the most-likely complications.
    • The most important information (about the repo size, why you might want to make a shallow clone, and how) could fit in with the existing step 1. there (or perhaps as a 1a. and 1b.?)
    • Not your problem, but the existing steps could probably use a new optional 2nd step that immediately addresses creating a branch for changes…
    • Fit the git push information in by expanding the current 7th step?
    • Address how someone with a shallow checkout should navigate the possibility that they’ll need to rebase against new work on master before their changes are ready for merge?
    • Address the “refresh” info more concretely for this use-case? (I assume it’s “fine” if they re-truncate the master branch, but that they probably don’t want to until their PR is merged?) in a note on the 7th step?
  2. Anywhere that discusses using a local checkout of nixpkgs (I’m a bit surprised not to see this use case in the nixpkgs manual anywhere, which seems like an oversight–but I do see 3 instances in the NixOS manual).
    • This context probably doesn’t need to address pushing the changes or amending.
    • It can probably address re-truncating as just “updating your shallow clone” with a caution against doing this if they have local changes?
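
On the rebase question raised in the list above, one possible approach (a sketch, not an established nixpkgs recommendation) is to fetch the new tip at depth 1 and replay only your own commit with `git rebase --onto`. Everything in the demo below is hypothetical: the `upstream` repository, the file names, and the `demo` identity. The branch name `myPackageEdit1` is borrowed from the original write-up.

```shell
#!/bin/sh
# Throwaway demo: rebase a one-commit branch in a shallow clone onto a newer
# upstream tip. Repository, file names, and identity are hypothetical.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)
cd "$demo"

# Stand-in upstream with two commits on master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo old > upstream/a.txt
git -C upstream add a.txt
git -C upstream commit -q -m "old"
echo base > upstream/a.txt
git -C upstream commit -q -am "base"

# Shallow clone, then one local commit on a feature branch.
git clone -q --depth=1 "file://$demo/upstream" work
git -C work checkout -q -b myPackageEdit1
echo mine > work/mine.txt
git -C work add mine.txt
git -C work commit -q -m "my change"

# Upstream gains a commit while the branch is in review.
echo newer > upstream/b.txt
git -C upstream add b.txt
git -C upstream commit -q -m "upstream change"

# Fetch just the new tip, then replay only the branch's own commit onto it.
# myPackageEdit1~1 is the old shallow tip the branch was created from.
git -C work fetch -q --depth=1
git -C work rebase -q --onto origin/master myPackageEdit1~1 myPackageEdit1

rebased_tip=$(git -C work log -1 --format=%s)
rebased_parent=$(git -C work log -1 --format=%s HEAD~1)
echo "tip: $rebased_tip, parent: $rebased_parent"
```

With more than one commit on the branch, the `myPackageEdit1~1` argument would need to point at the commit just below your first one instead.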

Thank you for your comments, I sincerely appreciate you giving feedback. Personally I would prefer keeping steps very, very small and addressing only the most frequent use case. This is pretty confusing stuff anyway. I’ve expanded step 7.

A related, though not identical, issue: when I have multiple PRs in progress, I don’t want multiple checkouts of the entire repo. This is where git worktrees come in handy. I have a bare clone and multiple worktrees, e.g.:

mkdir ~/programming/nix/nixpkgs
cd ~/programming/nix/nixpkgs
git clone --bare .git
# you can do `git remote add upstream
# Add worktrees for each ongoing pull request
git worktree add -b foo-init-at-1.0.0 ./foo-init-at-1.0.0
# cd into ./foo-init-at-1.0.0 and do stuff, you can pretty much treat it as a checkout
git worktree add -b bar-update-1.0.0-to-2.0.0 ./bar-update-1.0.0-to-2.0.0
# etc
# $ du -sh * .*
# 123M    bar-update-1.0.0-to-2.0.0
# 123M    foo-init-at-1.0.0
# 1.1G    .git # the full git repo; the worktrees only consume the actual checkout's worth of space
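
To round out the worktree setup above, here is a small sketch of the rest of the lifecycle: listing worktrees and removing one once its PR is done. It runs against a throwaway repository; the `upstream` repository, the `nixpkgs-demo` directory, and the `demo` identity are hypothetical, and `git worktree remove` assumes Git 2.17 or later.

```shell
#!/bin/sh
# Throwaway demo of the bare-clone-plus-worktrees layout, including cleanup.
# The "upstream" repository, directory names, and identity are hypothetical.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)
cd "$demo"

# Stand-in upstream with a single commit on master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo hello > upstream/file
git -C upstream add file
git -C upstream commit -q -m "initial"

# Bare clone into .git, as in the layout described above.
mkdir nixpkgs-demo
cd nixpkgs-demo
git clone -q --bare "file://$demo/upstream" .git

# One worktree (and branch) per ongoing pull request.
git worktree add -b foo-init-at-1.0.0 ./foo-init-at-1.0.0
git worktree add -b bar-update-1.0.0-to-2.0.0 ./bar-update-1.0.0-to-2.0.0

# Show what exists; the bare repository itself is listed too.
git worktree list

# Once a PR is merged, drop its worktree and its local branch.
git worktree remove ./foo-init-at-1.0.0
git branch -q -D foo-init-at-1.0.0

bar_branch=$(git -C bar-update-1.0.0-to-2.0.0 rev-parse --abbrev-ref HEAD)
echo "remaining worktree is on branch: $bar_branch"
```

Removing a worktree this way refuses to proceed if it has uncommitted changes, which is a useful safety net compared with deleting the directory by hand.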