How does it compare to Nix-casync, a more efficient way to store and substitute Nix store paths?
There is work in progress: https://obsidian.systems/blog/nix-ipfs-milestone-1
Won't the binary change every time the store path of one of its dependencies changes? The binary's rpath must contain those paths.
The Solaris IPS packaging system did away with transferring package archives entirely, and transfers individual files, with a content-addressing scheme. This was in large part because even when a package update changes something, many files are unaffected and can be reused between versions.
While you might lose a little efficiency from single-archive compression (many doc files compressed together, etc.), it seems the gain instead comes from avoiding repeated downloads.
Because of reproducibility, it's hard to skip rebuilds. Guix has a "graft" system for package updates with really minor changes, to avoid recompiling the whole dependency graph, but I suppose it kills reproducibility?
There already is; in typical Nix fashion it's called replaceRuntimeDependencies. It works by going through every file in the system and replacing the store path of the original package with a replacement.
It won't work for every package update, though: it assumes the replacement store path has the same length as the original. So you could do this for security patches and minor bug fixes.
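A minimal sketch of what that looks like in a NixOS configuration; the patched openssl and the patch file name are hypothetical placeholders:

```nix
{ pkgs, ... }:
{
  # Hypothetical example: swap in a patched openssl without rebuilding
  # everything that links against it. The replacement's store path must
  # be the same length as the original's for the rewrite to work, so the
  # derivation name has to stay the same.
  system.replaceRuntimeDependencies = [
    {
      original = pkgs.openssl;
      replacement = pkgs.openssl.overrideAttrs (old: {
        patches = (old.patches or [ ]) ++ [ ./security-fix.patch ];
      });
    }
  ];
}
```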
If the goal is to increase NixOS sustainability, I'm not sure using IPFS will be an improvement: the IPFS node software is a big resource hog. I admit it's been a few years since the last time I tried it, but an idle node would use a few percent of CPU and several Mb/s of bandwidth with just a handful of pinned files.
IPFS is all nice (at least in theory) and it gives file-level granularity, but we could probably get away with just distributing NARs using plain old torrents, much more efficiently.
IPS was incredibly slow; I'm curious to know whether this was due to updating files one by one.
IPFS draws CPU when it's actively contributing to P2P network routing; this is not mandatory and not really useful if you only use it locally to access IPFS content. It has also gotten better with respect to resources.
I'm not sure exactly, and honestly my experience wasn't too bad with it, but I'll posit some combination of:
- insufficient download parallelism and/or streaming with earlier HTTP
- iops amplification and latency cascades with small files and duplicated metadata updates
- likely running on early zfs, which had some pretty strong transaction commit latency
- likely running on spinning media with short concurrency queues
- conservative sync writes in the package manager
- the need to keep two copies of the file (or hardlink in some cases?); one in the store and one in the system
- different expectations; it may have seemed slow compared to (say) apt (on ext4), but it was already faster and more convenient than the previous Solaris pkg system, so there was room to start with a conservative implementation and optimise further later
I wrote Spongix after evaluating nix-casync. Compared to it, Spongix offers:
- garbage collection based on LRU/max-cache-size and integrity checking
- metrics
- proxying and caching multiple upstream caches behind itself
- uploading chunks to S3 compatible stores
- signing narinfos if no signature is present (for automated build farms on top of Cicero)
It probably has a few more features that I forgot about, but we've been running it in production at IOG for a few months now and it performs much better than our previous Hydra->S3 setup.
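For anyone curious what using it looks like on the client side, this is roughly how you point Nix at a local Spongix instance as a substituter; the URL, port, and first public key are placeholders for whatever your deployment actually uses:

```nix
{
  # Sketch of a NixOS client configuration; "spongix.lan:7745" and the
  # first key are placeholders, cache.nixos.org stays as a fallback.
  nix.settings = {
    substituters = [
      "http://spongix.lan:7745"
      "https://cache.nixos.org"
    ];
    trusted-public-keys = [
      "spongix.lan:REPLACE_WITH_YOUR_CACHE_PUBLIC_KEY"
      "cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY="
    ];
  };
}
```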
CA derivations are strictly better, but it is really hard to get shit merged in this community; thus Hydra still doesn't have support for them upstream, and we cannot begin testing things in Nixpkgs.
I remain incredibly frustrated that we have this thing 90% done, and there is no will to unblock willing maintainers who are ready to do the work of getting the feature out into the real world.
Part of the problem is that the roadmap targets 3.0 for flakes and CA for 4.0.
We are not 3.0 yet, and flakes are far from being ready, despite what Eelco says…
And the last time I tried CA, despite having set up the CA-enabled cache, it bootstrapped some compilers, which ultimately caused a world rebuild. Also, Cachix does not yet support CA.
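For reference, opting a derivation into floating content addressing currently looks roughly like this (with the ca-derivations experimental feature enabled in nix.conf); treat it as a sketch rather than a recommendation:

```nix
{ pkgs ? import <nixpkgs> { } }:

# Requires `experimental-features = ca-derivations` in nix.conf.
pkgs.stdenv.mkDerivation {
  name = "hello-ca";
  src = pkgs.hello.src;

  # Floating content addressing: the output path is computed from the
  # output's contents rather than from the inputs.
  __contentAddressed = true;
  outputHashMode = "recursive";
  outputHashAlgo = "sha256";
}
```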
And what annoys me about CA: its design still relies on a single authority per cache to map IA to CA as known by that particular cache.
What we need instead is a distributed "trust" network that can point from IA not only to the CA, but also to a "store"/"mirror"/"cache"/IPFS or whatever to download it.
3.0 is already a big giant mess that we cannot review; we need to minimize the scope. Flakes are a huge ball of unaudited complexity that we are in no way ready to stabilize in one go.
I am not saying Flakes should be 4.0 and CA 3.0, but we should focus on layering so we can stabilize e.g. part of the new CLI (stuff like nix show-derivation) without worrying about Flakes.
This is an awesome tool!
Could this be modified to be used by end users (people not building projects) as a way to improve nix efficiency?
I run it on my personal machines to save a lot of bandwidth alongside regular automatic Nix GC runs. So that's totally possible, yeah.
I currently use an nginx reverse proxy as a substituter to cache packages on my LAN; Spongix could be a great replacement.
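For context, this is roughly the kind of setup I mean, expressed as a NixOS nginx configuration; the host name, cache path, and sizes are arbitrary placeholders:

```nix
{
  # Sketch of an nginx proxy that caches cache.nixos.org on the LAN.
  services.nginx = {
    enable = true;
    appendHttpConfig = ''
      proxy_cache_path /var/cache/nix-proxy levels=1:2 keys_zone=nixcache:50m
                       max_size=100g inactive=90d use_temp_path=off;
    '';
    virtualHosts."nixcache.lan".locations."/" = {
      proxyPass = "https://cache.nixos.org";
      extraConfig = ''
        proxy_cache nixcache;
        proxy_cache_valid 200 90d;
        proxy_ssl_server_name on;
        proxy_set_header Host cache.nixos.org;
      '';
    };
  };
  # LAN clients then add http://nixcache.lan to their substituters.
}
```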
I fail to see how running Spongix on a single machine can help? You still have to download packages. Is it because it allows you to retrieve packages that have been GCed? If so, there is a design issue here: you have a local package cache used to get packages that got GCed from the store.
I think phrasing this from an "eco-friendly" standpoint is rather pointless. 130TB is nothing (fits on 2 hard drives).
I think phrasing this in terms of build turnaround time is more effective and is a better motivation to address these issues.
Hopefully content-addressed packages can lead to the holy grail of build farms…
distributed trustless builds…
I know Adam and co. have been working on Trustix, a distributed build system… imagine your builds being done on machines directly connected to renewable energy, wherever the sun happens to be in the sky on Earth :-).
If this can somehow be linked to a way for builders (miners) to get rewarded for building derivations for others, then centralized building can become a thing of the past.
Hydra goes from 1000 CPUs to many, many thousands; use IPFS or Hypercore to distribute said builds…
A bit Star Trek, but if it works, it would probably change the course of software building and distribution forever… I mean, you can't expect to do a nixos-rebuild switch on Mars and fetch everything from Earth over a rather low-bandwidth and "slightly" high-latency TCP connection, can you?
Nix = $ would never be a truer word, then.
Trustless distributed builds are probably a long way in the future. From my understanding, Trustix is more about checking builds from independent builders. The problem is that demonstrating that two builders are independent is hard, and you need to trust someone or something to demonstrate they are independent (but it would certainly still require less trust than trusting a single Hydra provider).
However, safely distributing files is totally possible with IPFS (or certainly others; it means InterPlanetary File System, btw), possibly including a signed "input → CA (and IPFS) hash" mapping (with IPNS possibly being said signature). (But no one will have latency of more than a few seconds for a long time to come, although this may still help with load balancing and performance.)
Also, the centralized architecture as used today probably doesn't prevent using certain hosts under certain conditions (if we ignore the fact that this might not even be a good idea to begin with due to fabrication cost, but I'm not knowledgeable about this). The Internet is probably fast enough that transferring data to the other side of the world is totally possible and easily achievable (even though the builders still need to be trusted, sadly).