How to make Nixpkgs more eco-friendly / use less resources

If it’s any consolation, I appreciate all of the work that you’ve done to extend the usage of Nix to CA derivations and IPFS.

I think it’s really forward-thinking and helps demonstrate the power of using functional paradigms like Merkle trees.

11 Likes

I can’t wait to see IPFS being integrated natively into Nix!

2 Likes

In the grand scheme of things that would surely have a negative impact: Instead of using our build farm, the work would be done multiple times downstream on less efficient machines.

I’m also a bit confused about which resources we are talking about: environmental impact? Build time? Bandwidth usage? These issues are interrelated, but I don’t think we can e.g. meaningfully discuss environmental impact without knowing exactly what impact each part of our infrastructure has.

Build resource usage is also a tricky one: CA will probably make rebuilds faster, but I think it’s likely that the time previously spent waiting for rebuilds will then be used to schedule even more builds. That would be more efficient, but the net resource usage would be similar…

Bandwidth usage is a tricky one: I think it greatly limits the accessibility of Nix/nixpkgs/NixOS outside of the West. However, as @NobbZ points out, we can’t eliminate the necessity to regularly redownload an entire system with the way Nix is designed, because we deliberately don’t allow cheating via ABI compatibility and dynamic library loading – as conventional distributions do. I’m not sure if it’s actually feasible to significantly reduce the output size of software packaged in nixpkgs – there’s probably a big element of fighting against the current of modern software development involved. The substitution mechanism is probably the area where we can have the biggest wins.

7 Likes

I mentioned a potential solution to avoiding downstream rebuilds from (internal) library changes before: build against library stubs, then relink against the real thing in a second derivation. If one wanted to not only avoid rebuilds but also minimize substitution, the second derivation should be built locally (though allowSubstitutes = false may be a bit too strong for that). Then only libraries/executables that truly changed should be refetched, as long as the stubs are unaffected.
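
A rough sketch of what this could look like, assuming a hypothetical libfoo and hand-rolled stub generation (neither is an existing nixpkgs helper; patchelf, allowSubstitutes, and stdenv.mkDerivation are real):

```nix
{ stdenv, patchelf, libfoo, src }:

let
  # Cheap derivation exposing only libfoo's ABI surface (soname + symbols).
  # Its hash only changes when the interface changes, so internal libfoo
  # changes don't invalidate anything built against the stub.
  libfooStub = stdenv.mkDerivation {
    name = "libfoo-stub";
    buildCommand = ''
      mkdir -p $out/lib
      # Real tooling would generate this from libfoo's dynamic symbol table.
      echo 'void libfoo_placeholder(void) {}' \
        | $CC -shared -fPIC -x c - -o $out/lib/libfoo.so
    '';
  };

  # The expensive compile, linked against the stub instead of the real thing.
  appAgainstStub = stdenv.mkDerivation {
    name = "app-against-stub";
    inherit src;
    buildInputs = [ libfooStub ];
  };
in
# The cheap relink step, forced to run locally.
stdenv.mkDerivation {
  name = "app";
  allowSubstitutes = false;
  buildCommand = ''
    cp -r ${appAgainstStub} $out
    chmod -R u+w $out
    # Point the binaries at the real library instead of the stub.
    find $out/bin -type f -exec \
      ${patchelf}/bin/patchelf --replace-needed libfoo.so \
        ${libfoo}/lib/libfoo.so {} \;
  '';
}
```

As long as libfoo’s interface (and thus the stub) is unchanged, only the tiny relink derivation is rebuilt, and nothing needs to be refetched.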

4 Likes

This should be possible to do, purely, either with vanilla CA derivations and cleverness, or with a small extension to them.

2 Likes

There are other content-addressed distributed file systems available too; just because IPFS was first doesn’t mean it’s the best choice.

In fact, given the design choices of IPFS, if it gets popular it’s going to have problems scaling to millions of nodes, due to inefficiency in content discovery/publishing… it’s just one large DHT, but maybe they will copy the idea of topics soon.

However, let the best distributed content-addressed file system win.

With many solutions you can’t directly share /nix/store; instead you have to keep a copy dedicated to distributing over IPFS, which leaves you with quite a lot of redundant data. :frowning:

However, I need to check the IPFS Nix extensions and see if/how they address this.

I’ve reviewed the IPFS patches for Nix; judging by their size, an extensive amount of work has certainly gone into them.

Interesting stuff!

3 Likes

IPFS was not the first. Wikipedia describes a brief history of content-addressed storage. I recall exploring Tahoe-LAFS years before IPFS became available.

OK, first to get $$$$$$ of funding… sorry for not making that clear.

Another avenue to explore would be making it easier to share a Nix store between local machines; right now it takes a decent amount of work for end users to set up their machines as binary caches. It also comes with some weird gotchas, so the experience is not very smooth. Fixing those problems has the potential to save bandwidth on the server and provide a better experience for bandwidth-constrained users, and is probably much more tractable than the optimizations listed above.
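
For reference, a minimal sketch of the consuming side on NixOS today (the hostname and first key are placeholders); the signing-key plumbing on the serving machine is a big part of why this is fiddly:

```nix
{
  nix.settings = {
    # Try the LAN machine first, then fall back to the official cache.
    substituters = [
      "ssh-ng://nix-ssh@builder.local"
      "https://cache.nixos.org"
    ];
    trusted-public-keys = [
      "builder.local-1:PLACEHOLDER-PUBLIC-KEY="
      "cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY="
    ];
  };
}
```

The serving machine additionally needs nix.sshServe.enable = true and a signing key configured via secret-key-files so its paths are trusted; that manual plumbing is exactly the part that could be smoothed over.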

From an eco standpoint that would probably be only a modest win, if any, but it would be greatly appreciated by users with bandwidth constraints.

Without more concrete bandwidth and compute data on the current infrastructure, though, everybody here is just guessing.

5 Likes

You could take a look at peerix (https://github.com/cid-chan/peerix), a peer-to-peer Nix binary cache.

Otherwise, you can set up a local cache, but it’s not very practical: you need to make it the first substituter, and when it’s not reachable Nix is unhappy :confused:

edit: I wrote about peerix

3 Likes

Graham has all the answers (perhaps), but I’m not sure if he is willing to or able to share that data; however, there’s no harm in asking if someone is willing to put time into a Nix sustainability report. Whatever it reveals, Nix is much more eco-friendly than Gentoo ever was, since a single compiled package can be used by many users via Nix’s really clever binary caching features…

I’ve done some preliminary work using the Hypercore file systems over FUSE, but NGI was not too interested in funding it at the time.

I think all eyes were on IPFS projects… but maybe next year ;-).

@Ericson2314 thanks for your hard work and research into this, and @Solene for asking quite an important question.

3 Likes

This thread’s title, and its implication that people compiling software for themselves is bad for the environment, really bother me.

Decentralization is an essential feature of healthy ecosystems; monocultures inevitably collapse. Diversity costs energy, and thermodynamic beancounting papers over this.

I identify as an environmentalist, and I bristle at the movement being reduced to mere joule-tallying.

3 Likes

As someone involved in BSD development, I agree with the diversity point. It’s bad to see only the Linux/amd64 combination.

However this doesn’t prevent working on a more efficient diversity :+1:t3:

5 Likes

A few weeks ago I did a highly unscientific experiment: I took the (uncompressed) NAR serialisations of two random Firefox store paths and checked how much would be downloaded if one was present locally and the other was provided via zsync/zchunk. The results were disappointing: the downloaded amount was about as much as the xz-compressed NAR file in each case.
Things may be better with some NAR-aware chunking (at file boundaries, handling store paths).

3 Likes

There is also an upcoming hackathon: https://sdialliance.org/landing/softawere-hackathon/

3 Likes

They use this piece of software to measure power consumption on a node; it then exposes the data via a REST API.

Maybe something like this could be used as a starting point?

1 Like

I enjoyed reading this discussion.

I wanted to add, similar to @sternenseemann, that we should probably analyze the resource usage of the Nixpkgs and NixOS ecosystem before discussing possible remedies. For example,

  • How much power does the Nixpkgs Hydra use as part of the build process?
  • What is the bandwidth usage of cache.nixos.org? How does bandwidth usage compare to “power usage”? Is there a conversion factor for computing the CO2 equivalent?
  • Further, we could have a look at how many builds happen in a distributed way, for example on personal computers.
  • Are there other parts that use resources?

We could also have a look at other parts of the Nixpkgs ecosystem such as the Nix Community builds and cache.

I am by no means a resource usage expert, but I do think the lowest-hanging fruit should be eaten first.

With respect to local builds, I am using deploy-rs, and so I build all my derivations on one computer and transfer the outputs over the local network. Of course, this is not an option when the systems are owned or used by different people/entities.
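
For context, the relevant fragment of such a setup looks roughly like this (the node name and hostname are placeholders; deploy-rs evaluates and builds on the deploying machine and only copies the resulting closure to the target):

```nix
{
  # Fragment of a flake's outputs, with deploy-rs and self in scope.
  deploy.nodes.laptop = {
    hostname = "laptop.lan";
    profiles.system = {
      user = "root";
      sshUser = "root";
      # Building happens on the deploying machine; only the closure
      # is transferred over the local network.
      path = deploy-rs.lib.x86_64-linux.activate.nixos
        self.nixosConfigurations.laptop;
    };
  };
}
```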

1 Like

… for which a flake is available!

1 Like

It feels like I’m not downloading as many packages a week as I used to? I suppose switching to flakes instead of using niv led to a huge improvement. I understand that niv was pinning a commit, but sometimes not everything was rebuilt, or I was able to update a lot more often; with flakes, nixpkgs seems a bit more curated, the nixpkgs available through the registries almost always seems complete, and it is published every 5 or 6 days.

2 Likes

The flake registry is not more curated than the git repository, since it is really only an alias. The only difference is that (I think) it defaults to the nixpkgs-unstable branch, which should be a bit better than tracking master – though often you should probably use nixos-unstable instead.

It’s a shame that niv never got ergonomic channel support (only via the convenience branches we have in the GitHub repository); npins finally offers that feature.

1 Like