Suggestion/Feature: use BitTorrent or IPFS to download packages

I did put some work into that a few months ago. It worked reasonably
well, but it hinged on one weird idea that required me to add a patch to the
Nix daemon: GitHub - andir/local-nix-cache: A poor and hacky attempt at re-serving local nix packages that came from trusted sources

I do not recommend using that for anything important. In the meantime I
have had an idea to use IPv6 site-local multicast to exchange messages
between Nix-local-cache nodes.


Hi,

Yes, sadly IPFS has not reached its goal yet. But I don’t see a
reason why it shouldn’t in the future.

Mh, IMVHO because:

  • it’s already a monster

  • no one in history has ever reached the point of a fully scalable
    distributed solution, even in an ideal IPv6 scenario with a
    global address for every peer, without any kind of NAT or other
    obstacles, and not counting throwaway addresses from privacy
    extensions…

We know MANY fully decentralized and free solutions, from email to
Usenet, passing through tons of less used/known ones: they scale,
they work well, their complexity is an order of magnitude simpler
than that of fully distributed solutions, and their performance tends
to be excellent. Of course they demand a bunch of servers, reachable
and known, but hey, a domain name does not cost that much, at least
in the Western world (which is for now the biggest FOSS population
in the world), and personal servers, from VPSes to universities to
voluntary mirrors run by various FOSS users, are easy and numerous
enough to avoid depending on a single entity’s or megacorp’s cloud,
not in a hypothetical future but today, and at a price cheap enough
to make it easy to migrate to a distributed solution IF we (society
as a whole) ever create one…

What do you think about an architecture like Tox Bootstrap Nodes?

I know them too superficially to say much; from the little I know
they are all promise, though mostly failed from the start…

Do you see a pragmatic solution where my computer uses packages from
another computer on my local network to install packages? Computers
should share their store and announce it to the others. Maybe some
zeroconf magic? And nix-serve.

My knowledge of Nix is too limited to answer properly, BUT IMVHO it can
be done, like other package managers do, and by itself it would greatly
reduce the dependency on a central server: a big company LAN, instead of
generating tons of traffic to download identical bits, can do a single
download and spread it internally, with FAR better performance and FAR
less load on the Nix{,OS} infrastructure.
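
Just to illustrate how small the serving side is, here is a minimal
sketch (untested in this thread; the port and key paths are
placeholders) of a NixOS machine exposing its store over HTTP with
nix-serve:

    # Serve this machine's /nix/store to the LAN over HTTP.
    { config, ... }:
    {
      services.nix-serve = {
        enable = true;
        port = 5000;
        # Generate the signing key pair once, e.g.:
        #   nix-store --generate-binary-cache-key my-lan-cache-1 cache-priv-key.pem cache-pub-key.pem
        secretKeyFile = "/var/keys/cache-priv-key.pem";
      };
      networking.firewall.allowedTCPPorts = [ config.services.nix-serve.port ];
    }

Clients on the LAN would then add http://that-host:5000 to their
substituters and the matching public key to trusted-public-keys.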

Recently Framasoft published a nice cartoon [1] that should be considered
with care by ANYONE interested in IT. The idea that “the cloud” can
scale, be cheap, be reliable and be friendly, at any scale, forever and
ever, is like the old Simon’s Chronicles’ “hey, we can use the network
as the backup, keep pushing bits around instead of wasting disk
space”: a simple, crazy idea that only people without ANY IT knowledge,
and without much logical reasoning capability, could have; and even though
that is evidently true, many, highly skilled people included, tend to
forget it or not think about it.

A FOSS project that relies on centralized services is like a “free”
citizen in a dictatorship: free only to the extent of the dictator’s
leash, working not for freedom but for the dictator itself, who can
benefit from the “free work” and say “hey, look, this is an opponent, it
speaks, it is alive, so I’m not that bad”.

[1] https://framablog.org/wp-content/uploads/2020/03/installer-nos-instances.png


That doesn’t mean it will never happen :slight_smile: I’m still optimistic, but not for the near future.

In this context, it’s just a program that downloads files from the network.

But we have many files in the binary cache, like millions. I don’t remember exactly, but maybe it was 80 TB. I can’t mirror that at home, so it would be great to mirror just some parts while others mirror other parts, and together we have most of it.

So I configure my server to share its Nix store and tell others to use it as a mirror? That should work, and I think some do that, but it needs manual setup. It would be great if mirrors were found automatically. But we might just maintain a list somewhere, like a GitHub wiki.
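
For the manual-setup version, the consuming side on a recent NixOS could
look roughly like this (a sketch only; the mirror URLs and the first two
keys are made-up placeholders that would come from such a shared list):

    {
      nix.settings = {
        # Try community mirrors first, fall back to the official cache.
        substituters = [
          "http://mirror-a.example.org:5000"
          "https://mirror-b.example.net"
          "https://cache.nixos.org"
        ];
        # Only paths signed by one of these keys are accepted.
        trusted-public-keys = [
          "mirror-a.example.org-1:AAAA...placeholder..."
          "mirror-b.example.net-1:BBBB...placeholder..."
          "cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY="
        ];
      };
    }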

Hi,

In this context, it’s just a program that downloads files from the
network.

The issue is “how to reach someone to download from”; that is always the
issue, not only because of NAT and the absence of widespread IPv6 with
a globally routable static address per device (not counting privacy
extensions), but because, whatever algorithms one comes up with, the
“real bootstrap” of a fully distributed service is flooding the entire
network for every request. “Supernodes”, trackers, “bootstrap nodes” and
various kinds of “metadata prefetching” do mitigate that, but they mean
either not being fully distributed or having a constant DDoS from peers
looking for files…

But we have many files in the binary cache, like millions. I don’t
remember exactly, but maybe it was 80 TB. I can’t mirror that at home,
so it would be great to mirror just some parts while others mirror
other parts, and together we have most of it.

No need for that. Everyone just torrents out their own /nix/store: if I
have something, I offer it, and so do others. This means the cache becomes
a mere backup plus a tracker for every contributing Nix{,OS} user.

So I configure my server to share its Nix store and tell others to
use it as a mirror? That should work, and I think some do that, but it
needs manual setup. It would be great if mirrors were found
automatically. But we might just maintain a list somewhere, like a
GitHub wiki.

Well… On a LAN a fully distributed system might work; if the LAN is
small, things like Avahi prove to be effective enough. At internet scale,
IMVHO only torrents have proven to be a solution; all the others do not
offer usable performance… About generic mirrors: many distros do have
them. So perhaps it is only a matter of popularity and age: the ancient
distros were born in an era where mirroring the distro you use, if you
could, was common; these days people do not think about that, they tend
to consider the network as “part of nature”…
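
For the LAN case, a serving machine could announce its nix-serve
instance over mDNS roughly like this (untested sketch; the
“_nix-cache._tcp” service type is something I made up here, and clients
would still need their own discovery logic):

    {
      services.nix-serve.enable = true;   # serves the store on port 5000 by default

      services.avahi = {
        enable = true;
        publish.enable = true;
        publish.userServices = true;
        # Announce the cache as a custom mDNS service so peers can find it.
        extraServiceFiles.nix-cache = ''
          <?xml version="1.0" standalone='no'?>
          <!DOCTYPE service-group SYSTEM "avahi-service.dtd">
          <service-group>
            <name replace-wildcards="yes">Nix cache on %h</name>
            <service>
              <type>_nix-cache._tcp</type>
              <port>5000</port>
            </service>
          </service-group>
        '';
      };
    }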

– Ingmar

Just for the record, 90% of what Cachix served in February (~10TB) comes from CDN cache at basically unlimited speed. It’s going to be hard to beat that.


Hi,

Just for the record, 90% of what Cachix served in February (~10TB)
comes from CDN cache at basically unlimited bandwidth. It’s going to
be hard to beat that.

It’s not a matter of performance: CDNs are servers operated by third
parties that today might work well, might offer free services to
certain projects, etc. But that’s today.

Changing from that model to another one, if someday it can’t be used
anymore, is not a quick thing. Having “another way” operational and
tested turns a potential disaster into a potentially marginal issue.

BTW, if I can update my NixOS systems from a single personal server,
no CDN can beat my LAN speed. And if that server is actually any of
my NixOS instances, with a single line in my config, an open port and
a simple service I can trust well… it’s even better.
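
One way to read “a simple service I can trust” is to reuse the SSH
daemon that is already running: Nix can use another machine’s store as
a substituter over ssh-ng://. A sketch, with a placeholder host name:

    {
      nix.settings.substituters = [
        "ssh-ng://builder@my-server.lan"   # my own machine, reachable over SSH
        "https://cache.nixos.org"
      ];
      # Paths that originally came from cache.nixos.org keep their signatures,
      # so the usual trusted-public-keys setting keeps working unchanged.
    }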

Framasoft recently published a nice cartoon [1]; it teaches a lesson many
should learn, IMO…

[1] https://framablog.org/wp-content/uploads/2020/03/installer-nos-instances.png


Overall, after years of following this nix+CDN thread, I see problems in the motivation/difficulty ratio:

  • Designing a good system with IPFS-like aims is hard (apparently). And if someone manages it, can they expect lots of profit to recover the large investment? Centralized CDNs seem both easier to design and easier to monetize.
  • If the new solution won’t allow us to (eventually) shut down the current cache implementation, the motivation is rather lowered, unless we can significantly outperform it in some way.
    For example, we already had a running prototype of a homemade CDN service (not based on IPFS but on a fairly usual design of HTTPS servers updated through rsync). It only served binaries from recent channels (not all those 100+ TB)… nice, but typically you wouldn’t get better service from it than from our official CDN; therefore almost no one used it, and it cost money to run, so understandably it was shut down after some time.
  • The speed of a LAN is certainly superior, but I think such use cases have far simpler solutions than building a decentralized CDN :wink:

Related - [bug#52555] [RFC PATCH 0/3] Decentralized substitute distribution with ERIS

This is an initial patch and proposal towards decentralizing substitute
distribution with ERIS.

(I did some work with ERIS in my Sigil project, but not as a distribution method.)

There is an unfortunate name clash with the Eris binary cache project, which I had not noticed until now.


One of the NixOS infra servers is also informally called eris, I think.

I don’t understand where the security issues come from; I’m probably missing context or making wrong assumptions. This is roughly how I imagine the decentralized distribution of nixpkgs:

  1. Nix wants to build or find a binary for my derivation /nix/store/lpas4fgrsrcqzvygkzp2v1gmksm1xfpn-firefox-105.0.1.drv on my machine. Maybe this derivation has private information as input.
  2. Nix asks cache.nixos.org or another trusted cache whether it has a binary for that derivation; only the name lpas4fgrsrcqzvygkzp2v1gmksm1xfpn-firefox-105.0.1.drv is transmitted. Even if my derivation contains private information, only its hash is transmitted.
    Alternatively, Nix downloads a package list from cache.nixos.org and checks whether it contains the binary for my derivation. If not, no private information was transmitted.
  3. If cache.nixos.org has a binary for the derivation, Nix downloads a .torrent file or signature. It can then safely download the built derivation using BitTorrent/IPFS/…
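
For what it’s worth, step 3 is also where the usual Nix signature check
keeps working regardless of transport: the client refuses any path not
signed by a key it trusts. A sketch (the local gateway URL is purely
hypothetical, standing in for whatever BitTorrent/IPFS bridge would
exist):

    {
      nix.settings = {
        substituters = [
          "http://localhost:8042"     # hypothetical local p2p gateway
          "https://cache.nixos.org"   # fallback
        ];
        # Refuse unsigned paths no matter where they were fetched from.
        require-sigs = true;
        trusted-public-keys = [
          "cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY="
        ];
      };
    }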

Interesting discussion in any case :slight_smile: I get the impression that your goal is to also decentrally share binaries built by “normal people”, not by Hydra?

Actually, thinking more about it… we need Hydra to push a file
that associates the .drv hash with the fixed-output hash anyway. There is
a much simpler solution, then (though not an optimal one): just
symmetrically encrypt every derivation on the p2p network, and have Hydra
publish the keys to decrypt its builds along with the .drv → FO hash
mapping. It also avoids the case where weak secrets could be guessed,
because an attacker simply would not have the encryption key for the
derivation.

And in this case locally built packages would be encrypted with a personal key, which I only share with people I trust?
