Should IPFS be used as a source for fetchurl in Nixpkgs?

I maintain the package memtest86-efi. Recently there was a PR updating memtest86-efi to use an IPFS mirror in fetchurl. I’ve never seen this in Nixpkgs and I wanted to see how other Nixpkgs maintainers felt about this.

The problem with memtest86-efi

The problem with memtest86-efi is that it is a proprietary package, and the company creating it does not provide a versioned URL for the latest version.

The latest version of memtest86-efi is 9.3-1000. However, it is only available at an unversioned download URL, and when a new version gets released, it is published at that same URL. Older versions are available at versioned URLs; for instance, the previous version, 9.2-2000, can be downloaded from a versioned URL.

The reasoning from the company for not providing a versioned URL is here: version 8.1 distribution file is not versioned - PassMark Support Forums. Note that redistributing binaries is OK.

Up until now, I’ve always just packaged the previously released version, since it has a versioned URL.

IPFS for the source URL in fetchurl

There was a recent PR updating memtest86-efi to download the latest zip file using an IPFS mirror:

The benefit of this is that now the non-versioned file is content-addressed, so it doesn’t matter that it is not versioned.
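For illustration, a fetch through an IPFS gateway might look roughly like the sketch below. The CID, filename, and hash are placeholders, not the values from the PR; the point is that the content ID in the URL, plus the fixed-output hash, pins the fetch to exact contents even though upstream’s own URL is unversioned.

```nix
# Hypothetical sketch only: <cid> and <sha256> are placeholders.
# The gateway URL embeds the IPFS content ID, so the same URL can
# never silently start serving a different file.
{ fetchurl }:

fetchurl {
  url = "https://ipfs.io/ipfs/<cid>/memtest86-efi.zip";
  sha256 = "<sha256>";
}
```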

There are a couple drawbacks that I see:

  • I don’t use IPFS, so I wouldn’t personally be able to update to a later version of memtest86-efi. (Not being able to update your own packages is quite unfortunate!)
  • From memtest86-efi: 8.4 -> 9.3.1000 by TredwellGit · Pull Request #147574 · NixOS/nixpkgs · GitHub and Pin files | IPFS Docs, it appears that this is now relying on a single(?) user to pin the files on IPFS. Without this working in any sort of official capacity, it is hard to know whether or not this is more reliable than downloading from upstream.
  • I couldn’t find any other packages in Nixpkgs using IPFS mirrors as a download source, so it is at the very least uncommon.

I wanted to get a feel for how other Nixpkgs maintainers felt about this.


Can you predict the versioned URL that will appear, once the unversioned download gets upgraded to the next?

I’m thinking about a scheme where you have a fallback URL that gets tried on a checksum failure from the primary (unversioned) URL. I’m not sure if Nix has a fetcher like that already, but if it doesn’t it seems like something that might be useful generally.
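If I remember correctly, Nixpkgs’ fetchurl already accepts a list of `urls` that are tried in order, though it moves on to the next URL on a download failure rather than specifically on a checksum mismatch; either way, the final result is still verified against the fixed-output hash. A sketch, with placeholder URLs:

```nix
# Sketch with placeholder URLs: fetchurl tries each url in order.
# Whatever gets downloaded must still match the fixed-output sha256,
# so a silently-changed upstream file fails the build rather than
# slipping through.
{ fetchurl }:

fetchurl {
  urls = [
    "https://example.com/downloads/latest.zip"    # unversioned primary
    "https://example.com/downloads/v9.2-2000.zip" # versioned fallback
  ];
  sha256 = "<sha256>";
}
```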

Can you predict the versioned URL that will appear, once the unversioned download gets upgraded to the next?

I believe you can.

For instance, memtest86-efi: 8.4 -> 9.3.1000 by TredwellGit · Pull Request #147574 · NixOS/nixpkgs · GitHub updates to version 9.3-1000, which is currently available at the unversioned URL. However, when the next version is released, 9.3-1000 will likely become available at a versioned URL.

A fetcher that can fall back on a checksum failure seems like an interesting idea!


Shouldn’t the web archive be used as a fallback instead of IPFS? Wasn’t there something about putting all the tarballs we use in the web archive?

I don’t know anything about this, but I’d be interested in learning more. I’m guessing the web archive provides content-addressed URLs as well? Do you have a link where I can learn more about Nixpkgs using the web archive?

They are not content-addressed, but time-addressed.

You specify a time and a URL, and the web archive will respond with what it thinks was current at that time.
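Concretely, a Wayback Machine URL encodes a timestamp plus the original URL, so a snapshot fetch might look like this (all values are placeholders):

```nix
# Sketch with placeholder values: the Wayback Machine addresses
# snapshots by timestamp (YYYYMMDDhhmmss) + original URL, not by content.
# The fixed-output sha256 is still what guarantees we got the bytes we expect.
{ fetchurl }:

fetchurl {
  url = "https://web.archive.org/web/20211201000000/https://example.com/file.zip";
  sha256 = "<sha256>";
}
```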


I’m not sure they meant rather than

Ah, in the case of memtest86-efi, I’m not sure the web archive could be used, since the package is marked unfreeRedistributable. I don’t think Hydra will build it for us (so the sources won’t end up in the binary cache).

Most source code archives will use IPFS in the future, if they’re not already. fetchurl and similar should fall back to IPFS automatically in the case of a failure, IMO.


Perhaps relevant to @Ericson2314 because of Obsidian’s work on Nix on IPFS.

interesting stuff, like living in the future.

What happens when the user who has ‘pinned’ it doesn’t want to do that anymore? If multiple users are not pinning it, is that a ‘bus factor’ of one?

The gateways do cache it, even if it’s no longer ‘pinned’ on the network. However, these caches do expire.

If IPFS is going to be used for anything in nixpkgs, then the pinning must be guaranteed in some way… in what way, I’m not too sure.

It’s probably not far from fiction to write a script that scans nixpkgs for IPFS hashes and pins them, rather like /nixpkgs/maintainers/scripts/all-tarballs.nix. Perhaps that’s something the NUR could do, as it may be suited to them.

The comments from the author seem pretty weak regarding direct linking; there are a plethora of ways to stop direct linking of download files. Writing software is a fine art; releasing and distributing software is a finer art.

Somebody could take all common packages from the top level and make sure their sources, if fetched as a fixed-output derivation (FOD), are available through IPFS.

This rings a bell; I think it’s been tried before. But without an IPFS fallback for all fetchers, it’s pretty useless.

But totally possible.

Yeah, our implementation has IPFS work as a substituter, so you can have a fixed-output derivation like today that fetches any old way, but if you use an IPFS-compatible fixed-output hash, it can also be substituted that way.

The SWH–IPFS work we will hopefully be able to start before too long would mean there is a very nice peer of last resort for fetching sources.

Long term, I hope more original authors can host in a content-addressed way (IPFS would be great, or at least HTTP with some uniform URL scheme), GitHub and S3 can be persuaded to make content-addressed fetching easier, etc. It will take a while to establish the norm, but the goal is to make sharing and archiving of source code not have so much friction.

The DHT is the trickiest part of IPFS right now, but it would be interesting to compromise between content- and location-based addressing by smuggling some “hints” into fixed-output derivations about which peer is likely to have a certain content address.


Why is the DHT ‘the trickiest part’?

The underlying “connect to a peer and download/upload stuff” part is very simple and reliable with IPFS. The DHT routing part is harder: the user finds the peer that supposedly has the stuff by connecting to other peers which store the DHT nodes. There is redundancy, but this part is still just inherently trickier.

The IPFS people of course want to make everything work, and there are plans on how to make the routing more configurable to help fine-tune tradeoffs. That’s all good, but my view is that also working on this sort of “federated” use case, where there are things like persistent peers known to pin certain stuff and hints pointing to those peers, is a very good simple-stupid incremental step.

The IPFS DHT has a global namespace, so everything has to be in there. If ‘they’ can make that work quickly and reliably, I’ll be proven wrong! :slight_smile:

I’m currently working on some new ideas around hypercore, which has the concept of topic-based DHTs, or even independent DHTs, to make ‘cores’ around a particular (smaller) dataset. However, the good folks over at NGI seem to be set on IPFS as the solution to all trust-less p2p data distribution problems. I’ll try to convince them that trying out alternative solutions is probably healthy.

NLnet is definitely not dead-set on IPFS.

I just want the simple parts (connecting and exchanging) to be layered separately from the routing parts. IPFS does actually do a lot of layering. I don’t follow hypercore and the other stuff very much, but I don’t recall seeing much layering there.

It would be better if people could agree on protocols for their part, so we don’t have to wait with zero interop while people try different routing methods. That is surely a recipe for failure.


I posted this elsewhere too, but per Building the technology needed to bridge the Software Heritage archive – Software Heritage, we have started work on the IPFS–SWH bridge.

The idea with the prior IPFS Nix work was that IPFS should be a substituter, not a method of fetching. So you are always free to write down a “primary” URL to fetch from, and as long as you secure the fixed-output derivation, IPFS can act as an alternate way of getting it. SWH is already the archive of last resort for many things, so this puts a bunch of useful stuff on IPFS right out of the gate.
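As a configuration-level sketch of the substituter idea: Nix already takes a list of substituters in nix.conf, and a fixed-output path can be satisfied by any substituter holding content that matches its hash. The IPFS-backed entry below is purely hypothetical; only cache.nixos.org is a real default.

```
# nix.conf sketch; the ipfs:// substituter URL is hypothetical.
# The primary fetch URL in the derivation stays the normal path;
# a matching substituter just offers another way to get the same bytes.
substituters = https://cache.nixos.org ipfs://local-gateway
```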


Is IPFS needed, or can we fetch directly from SWH?