Targeting particular x86_64 generation (haswell/ivybridge/skylake/...)


#1

There are programs (arangodb to name one) which can be compiled for a particular generation of x86_64.

By default such builds target the builder’s CPU, which produces binaries that do not run on older CPUs. It also introduces non-determinism, because the builder’s CPU type leaks into the derivation.

Choosing the oldest platform is bad for performance.

So it would be nice to be able to compile for the exact processor on which the binaries will run.

The recent split of i686-linux into {i486-linux, i586-linux, i686-linux} gives hope that a similar split will eventually be done for x86_64.


#2

The usual approach is that upstream determines parts of code where it’s worth it, sets up the build system to compile multiple variants and adds runtime switches among them. It’s been common e.g. for multimedia packages (like ffmpeg), for many years IMHO, and it seems the best approach to me… but we can hardly keep maintaining such patches if upstream doesn’t want to support that.

The fact that we added these platforms doesn’t mean you’ll get many binaries for them in the official cache. It would be a huge combinatorial explosion.


#3

The ArangoDB case is not about a small chunk of assembler code; it is about compiling the whole derivation (~200 MB of binaries) with -march=skylake -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mavx -mfma -mbmi2 -mavx2 -mno-sse4a -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi

I find the official cache problem somewhat irrelevant. The official cache supports x86_64-linux, so it is supposed to host binaries that run on the oldest x86_64 CPUs; if it hosts skylake-optimized code, the same derivation must also contain CPU-detection code and a runtime fallback. So every official derivation must include a version (the fallback or the only one) targeting the oldest x86_64, which is typical for e.g. multimedia but difficult for e.g. databases or the Linux kernel.

The only problem with the official cache is that it might already contain binaries which fail to run on sandybridge, because the official builder ran on skylake and its CPU type and features were detected in configurePhase and leaked into the compiler flags (I suspect fontforge could be one of those derivations).


#4

Also, it could make sense to patch (or wrap) gcc to fail when executed with -march=native, because that makes the resulting derivation non-copyable to an older computer.
There is an explicit platform.gcc.arch for specifying the oldest common architecture.
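
For illustration, a minimal sketch of pinning the target architecture when importing Nixpkgs. It assumes a Nixpkgs version that accepts gcc.arch directly inside localSystem; the exact attribute path has differed between versions (this thread refers to it as platform.gcc.arch), so treat the layout as an assumption rather than the definitive interface. The value "sandybridge" is only an example.

```nix
# Sketch only: the attribute layout is an assumption and may differ
# in the Nixpkgs version you use.
import <nixpkgs> {
  localSystem = {
    system = "x86_64-linux";
    # Oldest CPU generation on which the resulting binaries must run.
    gcc.arch = "sandybridge";
  };
}
```

Since the official cache only carries generic x86_64-linux binaries, a package set instantiated this way would most likely have to be built from source.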


#5

#6

Yes, we need to find those particular cases and patch them. The hydra.nixos.org builders are quite a varied set of machines, so the result wouldn’t even always be the same.

We could make a cc-wrapper variant that detects these flags, but that would be a bit complicated, as we surely do want to allow -march=whatever outside nixpkgs while using compiler from nixpkgs. I’m not sure it’s worth it.


#7

We could make a cc-wrapper variant that detects these flags, but that would be a bit complicated, as we surely do want to allow -march=whatever outside nixpkgs while using compiler from nixpkgs. I’m not sure it’s worth it.

I think there are already some limitations we enforce for most of Nixpkgs but allow lifting by a build-time environment variable; I don’t see why it would be unsuitable here. Reproducibility might be worth it…
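
A rough sketch of what such a check with an opt-out could look like, written as a standalone wrapper script rather than a change to the real cc-wrapper. The variable name NIX_ALLOW_MARCH_NATIVE is hypothetical and invented for this example; the set of rejected flags is also just an assumption.

```nix
# Sketch only: a standalone "gcc" wrapper, not the actual Nixpkgs cc-wrapper.
{ pkgs ? import <nixpkgs> { } }:

pkgs.writeShellScriptBin "gcc" ''
  # NIX_ALLOW_MARCH_NATIVE is a hypothetical opt-out variable.
  if [ -z "''${NIX_ALLOW_MARCH_NATIVE:-}" ]; then
    for arg in "$@"; do
      case "$arg" in
        -march=native|-mtune=native|-mcpu=native)
          echo "gcc wrapper: refusing $arg, the output would depend on the build host" >&2
          exit 1
          ;;
      esac
    done
  fi
  # Hand over to the real compiler with the arguments untouched.
  exec ${pkgs.gcc}/bin/gcc "$@"
''
```

Builds inside Nixpkgs would leave the variable unset and fail loudly, while someone using the compiler outside Nixpkgs could export it to keep -march=native working.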

(As for x86_64 subtypes, I would support it, if mainline Nixpkgs part of the change is a list of platforms plus a low number of changes to fix platform specification in compilers as needed; just what is enough to allow overlays to work sanely with generation-specific or vendor-specific package flavours)


#8

This also raises the problem that we cannot build on an older processor while targeting a newer one (the compilation itself is fine, but small programs compiled and run in configurePhase, tests, … crash).
So remote builders should expose their gcc.arch.


#9

That’s already doable via supportedFeatures on machines and requiredSystemFeatures on derivations, though it’s not “standardized” for this use case ATM. Well, so far we seemed to be mostly OK with generic binaries like all larger distros (AFAIK, discounting source-focused ones like Gentoo), at least for the official Nixpkgs; IMHO it’s not too difficult for users to override some “leaf” packages for this purpose and keep rebuilding them.
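
To make that concrete, here is a sketch of a NixOS configuration fragment that wires both sides together. The feature name "gccarch-skylake", the builder host name and the choice of hello as the overridden “leaf” package are all made up for illustration; nothing here is an agreed convention.

```nix
# Sketch only: feature name and host name are hypothetical.
{ pkgs, ... }:
{
  nix.distributedBuilds = true;
  nix.buildMachines = [
    {
      hostName = "skylake-builder.example.org";
      system = "x86_64-linux";
      maxJobs = 4;
      # The builder advertises which CPU generation it can execute.
      supportedFeatures = [ "gccarch-skylake" ];
    }
  ];

  environment.systemPackages = [
    (pkgs.hello.overrideAttrs (old: {
      # Only machines advertising the feature are eligible to build this.
      requiredSystemFeatures =
        (old.requiredSystemFeatures or [ ]) ++ [ "gccarch-skylake" ];
      NIX_CFLAGS_COMPILE = "-march=skylake";
    }))
  ];
}
```

With this, a build that runs skylake-only test binaries would not be scheduled on an older builder, at the cost of having to label every machine by hand.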


#10

Yes, that would be perfect, assuming the gcc.arch values form a single ordered sequence.

Although it won’t cover cases like the Thinkpad X220’s well:

gcc = {
  arch = "sandybridge";
  extraFlags = ["-mavx"]; # per-machine feature flag on top of -march (note: GCC's -march=sandybridge already enables AVX)
};

#11

I don’t think it’s worth aiming for 100% accuracy, more like meaningfully splitting the “range” between plain x86_64 and new CPUs; just creating a few groups will probably be enough to get almost all the effect. Actually feature flags like -mavx feel more significant for this. The -mtune part of -march would be more relevant for old machines, I think, as the default/generic optimizer in GCC shifts in time towards the commonly used CPU types.