Hi all,
I am working on porting HHVM to Nix, and I have successfully made it work when sandboxing is disabled. HHVM used to be a package in nixpkgs, but it was removed because the derivation was broken, so I hope to add it back.
Unfortunately, HHVM's build script downloads its dependencies from the internet, which prevents it from building in a sandbox. I wonder whether we have any general solution for this situation?
I have an idea along these lines. I would propose a more general solution: a special HTTP proxy that records the URLs and hashes of the files the build downloads.
As a package maintainer, the workflow would look like this (a sketch of the resulting expression follows the list):
1. Set up the HTTP proxy as part of the derivation.
2. Add a passthru.updateScript that builds the derivation with --no-sandbox.
3. Run the updateScript: the special proxy records every file that gets downloaded and updates the nix file to include the recorded URLs and hashes as inputs.
4. Build the derivation again with --sandbox. This time the special proxy redirects HTTP requests to the recorded files, which are already available locally because they are now part of the derivation inputs.
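To make step 3 concrete, here is a rough sketch of what the generated expression might look like after the update script has run. Everything here is hypothetical: the deps.json file, the RECORDED_DEPS variable, and the replaying proxy do not exist yet; this only illustrates how the recorded URLs and hashes could end up as ordinary fixed-output inputs.

```nix
# Hypothetical sketch only: deps.json would be written by the recording proxy
# during the --no-sandbox build; the replaying proxy is imaginary as well.
{ lib, stdenv, fetchurl, hhvmSrc }:

let
  # Each entry is assumed to look like { url = "..."; sha256 = "..."; }.
  recordedDeps = builtins.fromJSON (builtins.readFile ./deps.json);

  # Turn every recorded (url, hash) pair into an ordinary fixed-output fetch.
  fetched = map (d: fetchurl { inherit (d) url sha256; }) recordedDeps;
in
stdenv.mkDerivation {
  pname = "hhvm";
  version = "0-unstable";  # placeholder
  src = hhvmSrc;

  # Expose the pre-fetched store paths so the replaying proxy (or the build
  # script itself) could serve them instead of touching the network.
  RECORDED_DEPS = lib.concatMapStringsSep " " toString fetched;
}
```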
I just wonder whether anyone has attempted a similar approach, or whether there is a better way to deal with upstream build scripts that download files.
That approach is cute, and would certainly work in general. You run into problems of when and how you collect those artifacts, and how you persist them, though, especially with proprietary applications where you might not be allowed to redistribute the downloaded artifacts.
It also really doesn’t help if the build script downloads binaries that need to be patched before they work on NixOS, which is usually when these scripts are actually problematic.
The best option is to (very nicely and patiently) ask upstream to make their scripts not require downloads at build time, either behind a switch or in general, by permitting a pre-download step whose individual downloads can then be predicted and provided by fetchurls before the script ever runs. Bonus points if they expose what they want to download in a standard file format, like an npm lock file.
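For illustration, if upstream did publish such a lock file, consuming it from Nix could be as simple as the sketch below. The build-deps.lock.json name and its { url, sha256 } layout are assumptions, not anything HHVM actually provides.

```nix
# Sketch assuming a hypothetical lock file with entries like:
#   { "zlib": { "url": "https://...", "sha256": "..." }, ... }
{ lib, fetchurl }:

let
  lock = builtins.fromJSON (builtins.readFile ./build-deps.lock.json);
in
# One fetchurl per locked entry; the build script is then pointed at these
# store paths instead of downloading anything itself.
lib.mapAttrs (name: entry: fetchurl { inherit (entry) url sha256; }) lock
```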
I think this is preferable, because these download-at-build-time strategies need to end, and every upstream we cause to at least think about it is a step in the right direction. If we one day get as much mindshare as Debian, this will have a hugely beneficial effect on the reproducibility of builds in general.
If that fails, or you’re just too hesitant to be that ideological about your packages, or you just need something in the meantime, I think patching the build scripts so they no longer download anything at build time, and managing the downloads by hand or by parsing their build scripts, is the correct solution.
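A minimal sketch of that approach, assuming the dependency list was extracted by reading the upstream script by hand. The build/get-deps.sh path, the example URL, and the tarball name are all placeholders, not real HHVM inputs.

```nix
# Minimal sketch: pre-fetch one (placeholder) dependency and patch the build
# script so it no longer downloads anything at build time.
{ lib, stdenv, fetchurl, hhvmSrc }:

let
  fooTarball = fetchurl {
    url = "https://example.org/foo-1.0.tar.gz";  # placeholder URL
    sha256 = lib.fakeSha256;                     # fill in the real hash
  };
in
stdenv.mkDerivation {
  pname = "hhvm";
  version = "0-unstable";  # placeholder
  src = hhvmSrc;

  postPatch = ''
    # Neutralize the download step (script name is a placeholder) and drop
    # the pre-fetched tarball where the script expected to put it.
    substituteInPlace build/get-deps.sh \
      --replace "curl -LO https://example.org/foo-1.0.tar.gz" "true"
    install -Dm644 ${fooTarball} third-party/foo-1.0.tar.gz
  '';
}
```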
You can also fall back to fixed-output derivations (FODs) with a suitably neutered version of the script that only downloads but doesn’t build. There are a lot of problems with this, but I think it’s the most common approach.
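For completeness, the usual shape of that fallback is a fixed-output derivation that runs only the download portion of the script. The --download-only flag and the get-deps.sh name below are assumptions about what such a neutered script might look like.

```nix
# Sketch of the FOD fallback: network access is allowed inside the sandbox
# because the output is pinned by a fixed hash.
{ lib, stdenv, curl, cacert, hhvmSrc }:

stdenv.mkDerivation {
  name = "hhvm-build-deps";
  src = hhvmSrc;

  nativeBuildInputs = [ curl cacert ];
  SSL_CERT_FILE = "${cacert}/etc/ssl/certs/ca-bundle.crt";

  buildPhase = ''
    mkdir -p "$out"
    # Hypothetical flag: run only the download half of the upstream script.
    bash ./get-deps.sh --download-only --dest "$out"
  '';
  dontInstall = true;

  outputHashAlgo = "sha256";
  outputHashMode = "recursive";
  outputHash = lib.fakeSha256;  # replace with the real hash after the first build
}
```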
I once tried the proxy approach, but it didn’t work because most of the web uses HTTPS, which means the proxy can only see the server and port, not the full URL.
@Atry, I’m very interested in this idea of using a proxy, having seen the brilliance of Nix but fought with language tooling (pip, mix) and gone through multiple *2nix solutions with broken edge cases to get things packaged right.
My intuition is that it would be an elegant alternative to the current method of flipping installers/builders inside out to work with Nix.
> You run into problems of when and how you collect those artifacts, and how you persist them, though, especially with proprietary applications where you might not be allowed to redistribute the downloaded artifacts.
I have an idea in mind: basically, persist a custom lock file and also persist the ephemeral artifacts, similar to the idea of an impure translator in dream2nix.
Anyways, is anyone working on this? I’d love to partner up.