Point "import" statement to git URL?

As a newb to NixOS, I’m trying things that I think would be intuitive, but that end up not being possible. For example, I thought I’d try to import a nix package from GitHub:

nix-repl> import https://github.com/svanderburg/nix-patchtools/archive/master.tar.gz   
error: string 'https://github.com/svanderburg/nix-patchtools/archive/master.tar.gz' doesn't represent an absolute path, at (string):1:1

nix-repl> import <https://github.com/svanderburg/nix-patchtools/archive/master.tar.gz>
error: syntax error, unexpected '>', at (string):1:76

That doesn’t work.

I can make it work, by doing this:

# nix-channel --add https://github.com/svanderburg/nix-patchtools/archive/master.tar.gz patchtools
# nix-channel --update patchtools
# nix repl
nix-repl> import <patchtools>                                                        
«derivation /nix/store/76pyvhb67ybid35dscpr2h9lidpsimb3-autopatchelf.drv»

Is there a way to work with Git URLs directly, so that I don’t have to manually add the channel and then remember what I named it (which means that if I change the name, I need to update all the sources)?

If there isn’t a way, I think this would be a convenient feature.


Suppose I want to depend on tens or hundreds of channels for various tools. It would get out of hand and error prone if I were to manually add all those channels when bootstrapping a new system.


To include a nix package directly you can do something like


nix-repl> import (fetchTarball "https://github.com/svanderburg/nix-patchtools/archive/master.tar.gz")
[0.0 MiB DL]«derivation /nix/store/76pyvhb67ybid35dscpr2h9lidpsimb3-autopatchelf.drv»

or other fetchX functions like fetchFromGitHub.

The <channel> syntax only works for channels added with nix-channel or via the -I flag in some of the nix tools as far as I know.

Adding them as channels might be a better option if you only want to update all channels and packages at once, e.g. via nix-channel --update. If you use fetchTarball as above, the package is only cached for an hour by default and is fetched again after that. On the one hand, your build will take more time; on the other hand, rebuilding your system will update those packages. I personally have a shell script that adds all the channels my config depends on.

In my case I use packages from the Nix User Repository which can break. I moved NUR to a channel for my system to avoid those issues until I issue a channel update.

In the future, “flakes” will hopefully improve the workflow for importing external Nix packages, but I’m not sure how well they work at the moment.


To be more precise, it has nothing to do with channels other than that channels are added to NIX_PATH by default. Rather, the <foo> syntax points to a file system path located using NIX_PATH (and some commands allow extending NIX_PATH with -I flags). See the paths bullet point in the Simple Values section of the Nix manual and the description of the NIX_PATH environment variable for the details.
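
To illustrate: as far as I know, the angle-bracket form is just sugar for a lookup in the search path, which can be spelled out with builtins.findFile and builtins.nixPath. A minimal sketch, assuming NIX_PATH contains a nixpkgs entry (as it does on a default NixOS setup):

```nix
# Evaluate in `nix repl`. Assumes a `nixpkgs` entry exists in NIX_PATH.
# This should resolve to the same path as writing <nixpkgs>.
builtins.findFile builtins.nixPath "nixpkgs"
```

If the name is not present in the search path, the lookup fails in the same way <foo> does.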


Nice to know about combining import with fetchX like that.

What if we use a line like

import (fetchTarball "https://github.com/svanderburg/nix-patchtools/archive/6cc6fa4e0d8e1f24be155f6c60af34c8756c9828.tar.gz")

? Will it see the commit hash and cache it indefinitely? Or what about fetchFromGitHub, does that cache indefinitely based on the hash?

import just requires a path to a Nix file or to a directory containing default.nix, so any such thing will work:

nix-repl> fetchTarball "https://github.com/svanderburg/nix-patchtools/archive/master.tar.gz"
"/nix/store/y4nb70jhpnfq4rf2r9y5bh8vhb43j0a4-source"
$ ls /nix/store/y4nb70jhpnfq4rf2r9y5bh8vhb43j0a4-source
autopatchelf  default.nix  examples  LICENSE  README.md

builtins.fetchTarball allows you to omit the hash, and the result is cached for 1 hour by default, as per the docs. Actually, as you can see above, it creates a regular store path which will remain there until garbage collected; it is the connection between the URL and the store path that is cached for an hour. And I do not think GitHub commit URLs are treated in any special way at the moment.

If you add a hash, it should behave like a regular fixed-output derivation.
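
A minimal sketch of the hashed form, using the commit URL from the question. The sha256 value below is a placeholder, not the real hash; I believe the real one can be obtained with nix-prefetch-url --unpack on the same URL.

```nix
import (builtins.fetchTarball {
  url = "https://github.com/svanderburg/nix-patchtools/archive/6cc6fa4e0d8e1f24be155f6c60af34c8756c9828.tar.gz";
  # Placeholder hash (assumption): replace with the hash of the unpacked tarball.
  sha256 = "0000000000000000000000000000000000000000000000000000";
})
```

With a wrong hash Nix should report a mismatch and print the hash it actually got, which is another common way to find the right value.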

fetchFromGitHub comes from nixpkgs and requires you to specify a hash.
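
For comparison, a sketch of the fetchFromGitHub variant. It assumes <nixpkgs> is available in NIX_PATH, and the sha256 is again a placeholder rather than the real hash:

```nix
let
  # Assumes a nixpkgs entry in NIX_PATH.
  pkgs = import <nixpkgs> {};
in
import (pkgs.fetchFromGitHub {
  owner = "svanderburg";
  repo = "nix-patchtools";
  rev = "6cc6fa4e0d8e1f24be155f6c60af34c8756c9828";
  # Placeholder hash (assumption): replace with the real one.
  sha256 = "0000000000000000000000000000000000000000000000000000";
})
```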

Thank you! I understand more now.

If I run the same fetchTarball call (without a sha256 hash) after an hour has already passed, and have not run any GC yet, will it re-use the existing one or does it get a new one regardless?

Does running GC clean up the fetchTarball result only after an hour has passed, or does it clean it up regardless?

You showed that it stores the result in the nix store, but the doc says it caches in ~/.cache/nix/tarballs/. So it seems that there are two caches. I’m trying to understand how they interact (f.e. what happens if tarball cache is expired after an hour, how does the nix store path come into play when running the same fetchTarball command, etc?)

What the builtins.fetch* functions do is download the file and add it to the store as a fixed-output derivation (FOD). They create a symlink from the hash of the URL to the FOD in ~/.cache/nix/tarballs. Relevant source code:

After an hour, Nix will no longer be confident that the URL corresponds to the same fixed-output derivation, so it will download the file again. When adding it to the store, it will notice that the same FOD already exists, but at that point the file has already been downloaded and extracted. If you want to make sure the file is only downloaded once, you need to pass a hash so that Nix can match it with the FOD in the store.
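
As far as I know, the one-hour window is controlled by the tarball-ttl setting in nix.conf (in seconds, default 3600). A sketch of changing it from a NixOS configuration, assuming the nix.extraOptions module option is available on your NixOS version:

```nix
{
  # Assumption: tarball-ttl is honoured by your Nix version.
  # 0 means "always re-check the URL"; a large value trusts the cached link longer.
  nix.extraOptions = ''
    tarball-ttl = 0
  '';
}
```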

nix-collect-garbage will delete all files in /nix/store that are not tethered to a GC root. Since the links in ~/.cache/nix/tarballs are not part of any GC root[citation needed], the FODs will be removed regardless of the time. The tool does not know about the tarballs directory, so broken symlinks will likely remain there.

The fetch* functions produce a regular FOD so that will naturally end up in the store. It is not caching any more than when the result of nix-build ends up in the store. The tarballs cache directory is simply there to remember which FOD corresponds to a URL even when no FOD hash is passed to the fetch* function.

What fetchTarball essentially does:

cachelink=~/.cache/nix/tarballs/$(base32-encoded-sha256 "source\x00https://github.com/svanderburg/nix-patchtools/archive/master.tar.gz")
if $cachelink exists and up to date; then
    return "${cachelink}-file"
fi
curl -L https://github.com/svanderburg/nix-patchtools/archive/master.tar.gz
tar -xzf master.tar.gz # and name the extracted directory source
storepath=$(nix-store --add-fixed sha256 --recursive source/)
ln -s "$storepath" "${cachelink}-file"
write metadata into "${cachelink}.info"

Would it make sense for GC to know about those links? Seems like an unwanted side effect that now we have to think about (f.e. a user may not know garbage is left behind after GC).

Wouldn’t it be better if the links in ~/.cache/nix/tarballs were considered GC roots, so that when a link is updated to point to a new Nix store object, the old object gets GC’d? And wouldn’t this also prevent unwanted garbage in ~/.cache/nix?

Well I guess this is off topic from my original question here. I marked your answer about using fetch* functions as the solution. :slight_smile:

It makes sense for something to clean them up. I hope there’s already some process that does this, though I see 2-month-old links in my ~/.cache/nix/tarballs so maybe not.

The tarballs aren’t trusted after an hour, so we definitely shouldn’t be GC rooting them indefinitely. If we could GC root them for a single hour that would make sense, but that would require the GC-tracing code to understand these particular links and how long they’re valid for, which seems like a lot of complexity for the relatively low gain of “if you collect garbage less than an hour after fetching a tarball, you might have to redownload it if you want to use the same tarball again before it expires”.

As far as I can tell, Downloader::downloadCached is the only place the directory is used in the Nix code base. I suppose it is only subject to whatever clean-up procedures the user manually sets up for XDG_CACHE_HOME.

Unfortunately AFAIK there are no such procedures on macOS as macOS doesn’t know about the XDG structure at all.

Which reminds me, I should probably exclude ~/.cache from my Time Machine backups.


Neither do Linux or Windows… It’s a convention that applications adhere to in order to organize their stuff, and they can do that on any operating system. Though it might be hard to agree on the defaults on a Windows system…

Interesting input you all.

In the end, the reason I asked the original question is because I have a local folder with .nix files and I’d like to test them as I edit them.

I can:

  • Push the code to git and use a fetchX call in an import statement to test it. But then the cache might get in the way if it doesn’t download the new version for another hour. Using fetchGit with the new commit hash for each change would work.
  • Manually make a tarball then point the import statement to a path that contains the tarball.
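
For the first option, a sketch of what the fetchGit call might look like, reusing the commit hash from earlier in the thread. It assumes the commit is reachable from the repository's default branch; otherwise I think a ref would also have to be given.

```nix
import (builtins.fetchGit {
  url = "https://github.com/svanderburg/nix-patchtools.git";
  # Pin to an exact commit so a stale cache entry cannot be picked up.
  rev = "6cc6fa4e0d8e1f24be155f6c60af34c8756c9828";
})
```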

Either way, both options are more work than ideal. Is there another way in which I can use my source folder directly?

What’s the structure of your folder? You could write a file that constructs an overlay based on your folder and imports nixpkgs with that overlay, but the specifics of doing this depend on what exactly is in your folder.

There is a common pattern:

  • Use FOD with hash for reproducibility
  • Allow overriding it for development: nix-shell --arg pkgs "(import ../my-local-nixpkgs {})"

You can see that in the following example:
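
The linked example aside, a minimal sketch of the pattern could look like the shell.nix below. The nixpkgs commit and sha256 are placeholders (assumptions, not real values), and pkgs.hello is just a stand-in dependency.

```nix
# shell.nix
{ pkgs ? import (builtins.fetchTarball {
    # Hypothetical pin: substitute a real nixpkgs commit and its hash.
    url = "https://github.com/NixOS/nixpkgs/archive/REPLACE_WITH_COMMIT.tar.gz";
    sha256 = "0000000000000000000000000000000000000000000000000000";
  }) {}
}:

pkgs.mkShell {
  # Stand-in dependency for illustration.
  buildInputs = [ pkgs.hello ];
}
```

By default this uses the pinned, hashed nixpkgs; passing --arg pkgs as in the second bullet swaps in a local checkout without editing the file.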

@jtojnar I’m not sure what FOD stands for, but I see what that code does. :+1:

@lilyball I’m not familiar with overlays yet. What I’ll do is open a new topic now that the original question in this one is covered. EDIT: The new topic is at How to use a nix derivation from a local folder?.

Thank you both for taking the time to explain!

FOD = fixed-output derivation as mentioned above.

Not sure what you mean by that. import is just a function that takes a file system path to a Nix file (or to a directory containing default.nix) and returns the Nix expression from that file (which must be syntactically valid); see the linked description for more details.
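
So a local source folder can be imported directly, with no tarball or fetch involved. A sketch, assuming a hypothetical checkout at ./nix-patchtools (relative to where nix repl was started) that contains a default.nix:

```nix
# Hypothetical local path; any directory with a default.nix works.
import ./nix-patchtools
```

Edits to the files are picked up on the next evaluation, since nothing is cached by URL here.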
