Pre-RFC: Gradual Transition of NixOS x86_64 Baseline to x86-64-v3 with an Intermediate Step to x86-64-v2

What resources do you need?

Maybe the companies that are interested in these optimisations can help with those resources?

~100-400 TB of extra S3 storage and maybe something like 1000ish cores of compute time to mass-rebuild all of that when needed?

Given that no one proved they bring anything to the table, that’s highly uncertain.

1 Like

whoa…

ok, that’s a big ask.

Agreed.

(Playing devil’s advocate, I’m not personally convinced that -v2 does much performance-wise.)

FWIW, x86_64-v2 doesn’t mandate AVX. Steam’s hardware survey (one of the best public data sources to look at, IMO) shows SSE4.2 at 99.52% availability, vs. AVX at 97.28%.
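To make the level boundaries concrete, here is a minimal sketch that maps a set of CPU flags to the highest x86-64 psABI level it satisfies. The flag names assume the spelling Linux uses in /proc/cpuinfo (for example `pni` for SSE3 and `abm` for LZCNT), and `xsave` stands in for OSXSAVE; the function names are made up for illustration, and level 1 is simply assumed for any x86-64 CPU.

```python
# Feature sets per psABI level, spelled the way Linux reports them in
# /proc/cpuinfo (assumption: pni = SSE3, abm = LZCNT, xsave ~ OSXSAVE).
V2 = {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}
V3 = {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}
V4 = {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}


def psabi_level(flags):
    """Return the highest level (1-4) whose requirements are all met.

    Levels are cumulative, so we stop at the first level with a missing flag.
    Level 1 is assumed as the baseline for any x86-64 CPU.
    """
    flags = set(flags)
    level = 1
    for required in (V2, V3, V4):
        if not required <= flags:
            break
        level += 1
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the flag set of the first CPU; returns an empty set if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


if __name__ == "__main__":
    print(f"x86-64-v{psabi_level(read_cpu_flags())}")
```

Note that AVX only appears in the v3 set, so a CPU with SSE4.2 but no AVX still counts as v2, and a CPU with AVX but no AVX2 stays at v2 as well.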

I don’t know if I generally agree with that. We probably exclude more users and interesting use cases by not supporting ARMv7 than we’d do by moving the x86_64 baseline to -v2. NixOS makes it trivial for users to build from source if they have specific architecture constraints, so “someone might need it” doesn’t seem like a super strong argument to me.

7 Likes

That covers machines that are used for gaming. It skips over lots of machines out there: servers, routers, workstations.

I’m pretty sure that due to growing system requirements, the average gaming PC is more modern than everything else out there.

10 Likes

How do enthusiastic NixOS users go about testing the impact of these flags themselves? The last time I tried to set build flags to optimize for a specific x86-64 psABI level for my whole system following the guide on the wiki, it simply didn’t seem to work: Nix CPU global CPU flags - #2 by pauldoo

2 Likes

I would argue the only real reason we don’t support ARMv7 is that it’s hard to have it in CI. We have… surprisingly good and active maintenance of ARMv7 in NixOS (yes, people are running systemd with it and whatnot).

3 Likes

In the past, I did the work to look into this, you can use two of my branches towards this:

They simulate what would be the changes to nixpkgs if we bumped the minimal baseline.

I built them on https://hydra.newtype.fr/jobset/nixos/trunk-combined-x86_64-v2 and https://hydra.newtype.fr/jobset/nixos/trunk-combined-x86_64-v3, but I think I recently removed the binaries because I wanted to bump to a recent unstable and use the new timeout features for tests, since my Hydra often ended up stuck in NixOS tests for no reason.

If people are interested, I can rebase, clean up and ask Hydra to re-evaluate.

(The Hydra links are IPv6-only, I am sorry for people who may not have IPv6, I do not have money to spare on IPv4.)

2 Likes

Also, with glibc-hwcaps, shouldn’t it be possible to provide multiple compiled libraries in a single package? One could be for x86-64, another for x86-64-v2, and possibly a third for x86-64-v3.

It would also allow for this to be enabled or disabled at a package level. Some performance sensitive packages could build for multiple levels (media codecs, compression libraries, etc), while others might opt to build only for the baseline (a basic text editor, mkfs, lots of other examples).
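For anyone unfamiliar with the mechanism: glibc's dynamic linker looks in `glibc-hwcaps/<level>/` subdirectories of each library directory before the directory itself, trying the highest level the CPU supports first. The sketch below only models that lookup order as I understand it; the function name and the example paths are made up, and it is not how the loader is actually implemented. As far as I know this only applies to shared libraries, not to executables.

```python
from pathlib import PurePosixPath

# Subdirectory names glibc uses on x86-64, best level first.
HWCAPS_LEVELS = ["x86-64-v4", "x86-64-v3", "x86-64-v2"]


def candidate_paths(libdir, soname, cpu_level):
    """List where a library would be looked up, in order of preference.

    cpu_level is the psABI level of the running CPU (1-4). Levels above it
    are skipped; the plain library directory is always the final fallback.
    """
    base = PurePosixPath(libdir)
    paths = [
        base / "glibc-hwcaps" / name / soname
        for name in HWCAPS_LEVELS
        if int(name[-1]) <= cpu_level
    ]
    paths.append(base / soname)
    return [str(p) for p in paths]


if __name__ == "__main__":
    # Hypothetical library and directory, just to show the ordering.
    for path in candidate_paths("/lib", "libfoo.so.1", cpu_level=3):
        print(path)
```

A package that only ships the baseline build simply has no `glibc-hwcaps` directory, so the lookup falls through to the plain path on every machine.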

1 Like

This doesn’t change the storage costs.

If only someone could come up with a list of packages that benefit from it.

1 Like

Surely it must. There is more to a compiled package than the binaries and libraries. There are all sorts of other assets. Using glibc hwcaps only the libraries and binaries are duplicated, not the entire package.
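A back-of-the-envelope sketch of that argument, with entirely made-up numbers: the cache size and the share of it taken up by compiled objects below are placeholders, not measurements, and the function names are hypothetical.

```python
def full_rebuild_growth(cache_tb, extra_levels):
    """Extra storage if every package is rebuilt once per additional level."""
    return cache_tb * extra_levels


def hwcaps_growth(cache_tb, compiled_fraction, extra_levels):
    """Extra storage if only the compiled objects are duplicated per level."""
    return cache_tb * compiled_fraction * extra_levels


if __name__ == "__main__":
    cache_tb = 100            # hypothetical cache size in TB
    compiled_fraction = 0.25  # hypothetical share that is compiled code
    extra_levels = 2          # e.g. adding v2 and v3 next to the baseline

    print("full rebuild:", full_rebuild_growth(cache_tb, extra_levels), "TB")
    print("hwcaps only: ", hwcaps_growth(cache_tb, compiled_fraction, extra_levels), "TB")
```

The real saving depends on how much of each package is actually compiled code and on how many packages opt in, which nobody in this thread has measured.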

3 Likes

The note was pertaining to x86_64-v3.

v2 is a much easier pill to swallow as hardware without SSE4 really is getting to the point of not being useful anymore as even basic ARM SoCs outperform the best CPUs from that era nowadays. Even there I’d err on the side of caution though.

With v2 however, the benefits are even more questionable than with v3.

I’m all for supporting armv7l-linux too. I’ve got two older RPIs that I’d like to put NixOS on.

Difference is that we never supported armv7l-linux to any decent capacity while x86_64-v1 has pretty much always been supported.

The problem is that we don’t know who might need it. It could be literally no one or thousands; we’re blind here.

That could probably happen organically.

For example, let’s say someone wants to compress their music library to a higher FLAC level to save on storage. Being a typical NixOS user, they might spend an unreasonable time optimising the re-encode to be a few minutes faster. Assuming such a flag optimising for separate HWCAPS was already proliferated in Nixpkgs, they might try it out to see whether it makes a difference and whip up a quick PR if it shows a significant benefit.

What I also like about the glibc HWCAPS approach is that we could optimise packages for even higher levels (i.e. x86_64-v4 with AVX512) where I wouldn’t be surprised if gains were quite significant without breaking the other >90% of users’ systems.

2 Likes

While researching the feasibility of glibc-hwcaps, I came across this wiki article from openSUSE discussing this topic. I’d consider it required reading.

https://en.opensuse.org/openSUSE:X86-64-Architecture-Levels

3 Likes

That could be a compromise: building specific stuff in v2 and v3. For example, we could potentially build the kernel in v2 and v3. In general, I’m up for optimizing packages the way Clear Linux does.

Note that even building “specific stuff in v2, v3” will likely still preclude users as most of the interesting stuff to optimise is stuff people actually want to use. On systems with unsupported CPUs, those will just crash. After having thought about that a bit more thoroughly, I must retract the recommendation of that approach.

I think the most feasible way forward right now would be to build infrastructure to support glibc-hwcaps and then have it organically enabled on packages where it makes sense.

I do not see the general target being raised any time soon unless the data situation improves; RFC or not.

4 Likes

Data is definitely one of the biggest problems, or rather the lack of it. It’s quite a hard problem to solve though, because to benchmark this you would need a dedicated server, right?

You could benchmark on the very device you’re on right now. As long as it’s reasonably modern (CPU from the past years or so), that should be fairly representative.
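A minimal sketch of what such a local comparison could look like: run the same workload with two builds of a tool several times and compare the median wall-clock time. The helper names are made up, and the commands in the demo are placeholders that should be replaced with the baseline and optimised builds of whatever is being tested.

```python
import statistics
import subprocess
import sys
import time


def median_runtime(cmd, runs=5):
    """Run cmd `runs` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_cmd, candidate_cmd, runs=5):
    """Ratio of baseline to candidate median runtime (> 1 means candidate is faster)."""
    return median_runtime(baseline_cmd, runs) / median_runtime(candidate_cmd, runs)


if __name__ == "__main__":
    # Placeholder commands: swap in the two builds you want to compare,
    # each invoked on the same input.
    baseline = [sys.executable, "-c", "pass"]
    candidate = [sys.executable, "-c", "pass"]
    print(f"speedup: {speedup(baseline, candidate):.2f}x")
```

Using the median rather than the mean keeps one noisy run from skewing the result, but an otherwise idle machine and a fixed CPU frequency still matter for anything close to a fair comparison.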

I guess (not certain) that this forum will have a bias towards people who don’t need to use old hardware.

FWIW, my main computer uses a Xeon Sandy Bridge from 2013, which I think supports v2 but not v3.

My laptop has a CPU from 2011 because developers with slower and older hardware necessarily produce more performant software. I also sometimes build with -march=native and -mtune=native, why can’t you ricers do the same?

Intel is going to keep bloating their CPUs regardless of whether it makes the hardware better or worse because it’s an intrinsic part of their corporate culture. The Management Engine and Meltdown should have made it obvious that they won’t be held responsible for implementing bad features.

2 Likes

lol! That made me spit my (decaf) coffee out.

I suggest all Nix developers underclock their machines to the slowest speeds possible, remove as much RAM as they can, install a tiny disk, and connect through a 300 baud modem.

This should provide the productivity results we need.

Are we allowed to have graphics? If so, how many colours?

1 Like