Frustrations about splicing

waffle8946 · July 27, 2024, 4:24am

A lot of this is actually in the nixpkgs manual, but I consider it really overwhelming due to too much prose and too many analogies. Let me try to summarise.

The three platforms are relevant when building a compiler:

buildPlatform = platform where the compiler is built
hostPlatform = platform where the compiler is run
targetPlatform = platform for which the compiler emits code, if the compiler has the limitation of only being able to emit code for a single platform
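
For reference, all three values can be read off any stdenv. A minimal sketch, assuming `<nixpkgs>` is on `NIX_PATH` and using the `aarch64-multiplatform` cross set purely as an example:

```nix
# Evaluate with: nix-instantiate --eval --strict platforms.nix
# (the file name is arbitrary; <nixpkgs> is assumed to be on NIX_PATH)
let
  pkgs = import <nixpkgs> { };
  # Package set whose packages are built locally but run on aarch64-linux.
  cross = pkgs.pkgsCross.aarch64-multiplatform;
in
{
  build = cross.stdenv.buildPlatform.system; # the machine doing the build
  host = cross.stdenv.hostPlatform.system; # where the built packages run
  target = cross.stdenv.targetPlatform.system; # equal to host for this set
}
```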

The word “compiler” here is relevant, because if you’re not building a compiler, then targetPlatform is a bit nonsensical here and we should really just care about the first two platforms. (s/compiler/package/g, in that case…) Moreover, if you can choose the compiler’s target at runtime, then targetPlatform becomes irrelevant.

So, I’ll discuss for now the case of a non-compiler package, where we only care about build and host, because that’s really what us mere mortals are generally dealing with.

In such a case:

depsBuild*: things that should only be available at build-time, i.e. never get into the runtime closure!
- depsBuildBuild: used to emit more build-time code (e.g. some executables/libs used to build build-time tooling)
- ~~depsBuildHost~~ nativeBuildInputs: used to emit runtime code (e.g. build-systems for your package, like cmake, go here)
- depsBuildTarget: ~~used to~~ just don’t.
depsHost*: things used at runtime

In other words, depsHostHost and ~~depsHostTarget~~ buildInputs are the same, but we prefer calling the runtime deps buildInputs for no reason other than historical precedent.

But for the sake of strictness in this case, I will pretend that depsHostHost refers to buildInputs in the non-compiler case since, again, target is nonsensical.
depsTargetTarget: did you forget we don’t care about target?

To summarise: in a non-compiler package, we only use:

depsBuildBuild,
~~depsBuildHost~~ nativeBuildInputs, and
~~depsHostHost~~ buildInputs.
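
As a rough sketch of how those three lists look in an expression (the package, its version and the choice of `zlib` are placeholders, not taken from any real package):

```nix
# Hypothetical package; every name below is a placeholder for illustration.
{
  stdenv,
  buildPackages,
  cmake,
  pkg-config,
  zlib,
}:

stdenv.mkDerivation {
  pname = "example";
  version = "0.1.0";
  src = ./.;

  strictDeps = true;

  # Runs on the build platform and emits code for the build platform,
  # e.g. a C compiler for helper programs executed during the build.
  depsBuildBuild = [ buildPackages.stdenv.cc ];

  # Runs on the build platform and helps produce code for the host platform.
  nativeBuildInputs = [
    cmake
    pkg-config
  ];

  # Built for the host platform: libraries the result links against.
  buildInputs = [ zlib ];
}
```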

By this logic, to answer @AndersonTorres’s question:

extra-cmake-modules (ECM) is a (KDE???) package which contains instructions for augmenting cmake to find various commonly-used binaries and libraries at build-time. In other words, cmake is using ECM at build-time, for build-time, i.e. you do not want ECM in the runtime closure!

So strictly speaking, cmake goes in depsBuildBuild, ECM goes in nativeBuildInputs.
If cmake is also used to build the actual package, then cmake will also go in nativeBuildInputs.
This duplication is fine, because the two uses of cmake in these lists mean two different things.
Finally, whatever libraries you’re linking against, including those mentioned by ECM, will go in buildInputs.
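
Written out literally, that placement would look something like the sketch below. `libfoo` and the package itself are hypothetical, and I have not verified that this builds under cross; it only transcribes the reasoning above.

```nix
# Literal transcription of the placement argued for above.
# `libfoo` and the package are hypothetical; not verified under cross.
{
  stdenv,
  cmake,
  extra-cmake-modules,
  libfoo,
}:

stdenv.mkDerivation {
  pname = "example-kde-app";
  version = "0.1.0";
  src = ./.;

  # cmake as the consumer of ECM: build-time tooling for build-time use.
  depsBuildBuild = [ cmake ];

  nativeBuildInputs = [
    extra-cmake-modules
    cmake # listed again because it also drives the build of the package itself
  ];

  # Libraries the package links against, including those ECM helps to find.
  buildInputs = [ libfoo ];
}
```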

Technically runtime binaries could also go in buildInputs, but AFAIK we don’t have any hooks to wrap the resulting package with binaries from buildInputs (we usually just use symlinkJoin or an explicit makeWrapper call), so it’d be kind of useless to put binaries in that list.
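
A minimal sketch of the explicit makeWrapper route, assuming a hypothetical package that installs `$out/bin/example-tool` and needs `jq` on its PATH at runtime:

```nix
# Hypothetical package; `example-tool` and the choice of jq are placeholders.
{
  lib,
  stdenv,
  makeWrapper,
  jq,
}:

stdenv.mkDerivation {
  pname = "example-tool";
  version = "0.1.0";
  src = ./.;

  strictDeps = true;

  # makeWrapper runs during the build, so it is a build-time dependency.
  nativeBuildInputs = [ makeWrapper ];

  # jq is not listed in buildInputs; the wrapper references it directly,
  # which is what puts it in the runtime closure.
  postInstall = ''
    wrapProgram "$out/bin/example-tool" \
      --prefix PATH : ${lib.makeBinPath [ jq ]}
  '';
}
```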

Oh yeah and uh, let me note the irony of mentioning a KDE package, because Qt cross is broken in nixpkgs lmao

Maybe I’ll write a splicing essay tomorrow…

rnhmjoj · July 27, 2024, 10:38am

If we strive for always-on, ubiquitous cross-compilation, every layman maintainer should be able to easily write a package that works (provided upstream is not broken, of course) without a mandatory two-week course on cross-compilology.

emily · July 27, 2024, 11:57am

Oh, so I did make things worse in one respect with https://github.com/NixOS/nixpkgs/pull/329470…

Well, I guess since Qt cross is broken anyway it doesn’t really matter right now.

waffle8946 · July 27, 2024, 3:23pm

I think anyone who is cross-compiling will know the difference between buildPlatform and hostPlatform intuitively. So unless they are packaging a compiler, they really only have to ask themselves two questions:

Does this go in the run-time or (exclusively) build-time closure?
- If run-time → buildInputs (probably.)
- If build-time → does this emit code for run-time or (exclusively) build-time?
  - if run-time → nativeBuildInputs
  - if build-time → depsBuildBuild

And usually they might not even have to ask the second question, because most of the build-systems are emitting code for runtime, so guessing and putting it in nativeBuildInputs works fine most of the time.
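
The decision tree is small enough to write down as a function. This is only a toy model for illustration (the `classify` helper and its string arguments are made up here, nothing like it exists in nixpkgs):

```nix
# Toy model of the two questions above; evaluate with
# nix-instantiate --eval --strict on a file containing this expression.
let
  # closure:  "run-time" or "build-time"
  # emitsFor: "run-time" or "build-time"; only consulted for build-time deps.
  # Defaulting emitsFor to "run-time" mirrors the "just guess
  # nativeBuildInputs" shortcut.
  classify =
    {
      closure,
      emitsFor ? "run-time",
    }:
    if closure == "run-time" then
      "buildInputs"
    else if emitsFor == "run-time" then
      "nativeBuildInputs"
    else
      "depsBuildBuild";
in
{
  zlib = classify { closure = "run-time"; };
  cmake = classify { closure = "build-time"; };
  helperCompiler = classify {
    closure = "build-time";
    emitsFor = "build-time";
  };
}
```

This evaluates to `buildInputs` for `zlib`, `nativeBuildInputs` for `cmake` and `depsBuildBuild` for `helperCompiler`.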

Maybe, if we used consistent naming, this would become a simpler set of questions.
The first platform refers to the type of closure (Host vs Build).
The second platform refers to what it’s emitting code for (Host vs Build).
So, a new user might more easily remember the difference between depsHostHost, depsBuildHost, depsBuildBuild.
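
For comparison, here is how the existing names line up under that reading, written as a plain attribute set. The `runsOn`/`emitsFor` field names are mine; the pairing of each list with its two platforms follows the nixpkgs manual.

```nix
# Each dependency list as (platform it runs on, platform it emits code for).
{
  depsBuildBuild = { runsOn = "build"; emitsFor = "build"; };
  nativeBuildInputs = { runsOn = "build"; emitsFor = "host"; }; # conceptually depsBuildHost
  depsBuildTarget = { runsOn = "build"; emitsFor = "target"; };
  depsHostHost = { runsOn = "host"; emitsFor = "host"; };
  buildInputs = { runsOn = "host"; emitsFor = "target"; }; # conceptually depsHostTarget
  depsTargetTarget = { runsOn = "target"; emitsFor = "target"; };
}
```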

No need for target offsets unless you are building a compiler(-like) package, and that compiler has gcc-like limitations, or if you’re messing with stdenv bootstrapping (in which case you’ll suffer no matter what we call it).

But even if changing the names is unfeasible right now, I think the decision tree above suffices.
And if we arm people with that knowledge, we can get closer to flipping strictDeps to true by default.

(PS “provided upstream is not broken, of course” is carrying more weight than I’d hope. We do in fact see broken upstreams all the time, so…)

AndersonTorres · July 27, 2024, 3:44pm

What qualifies as a “layman maintainer” in a project like Nixpkgs?
Two weeks? It is a very short time, I would say!

rnhmjoj · July 27, 2024, 3:57pm

> I think anyone who is cross-compiling will know the difference between buildPlatform and hostPlatform intuitively.

This is the problem: most maintainers are not interested in cross-compiling, therefore their packages are more often than not broken.
Nixpkgs should try to make it easy to have cross-compilation, even if the maintainer does not care about all these subtleties except for build vs runtime dependencies.
At least once something cross-compiles, it should be hard to break it.

> Maybe, if we used consistent naming, this would become a simpler set of questions.

Maybe, but I don’t know if the GNU convention helps: I somehow always forget what these names mean right after I’ve checked the manual, which I do whenever I touch something slightly involved with cross-compilation.

> What qualifies as a “layman maintainer” in a project like Nixpkgs?

The small time contributor that has just packaged a small tool/library (which doesn’t cross-compile and later becomes required by a larger project), or that has just fixed an issue by adding { libfoo = libfoo_2_1 } and doesn’t know about splicing.

waffle8946 · July 27, 2024, 4:03pm

I agree; if we got something working under cross, an action we can take today is to simply put strictDeps = true; in the expression. Then the native builds will have an environment more similar to that of cross-compiled builds.

And if we can’t get cross working, put strictDeps = false; to explicitly flag it for further possible improvements.
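
In a package's own expression this is a single attribute. To try it on an existing package without editing nixpkgs, something like the sketch below should work, assuming `<nixpkgs>` is on `NIX_PATH`; `hello` is only a stand-in for the package in question.

```nix
# Build with: nix-build strict.nix   (the file name is arbitrary)
let
  pkgs = import <nixpkgs> { };
in
# `hello` is a stand-in; substitute the package you got working under cross.
pkgs.hello.overrideAttrs (old: {
  # Only expose dependencies to the build according to the list they are in,
  # as a cross build would.
  strictDeps = true;
})
```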

The other thing we will want to enforce is actually disallowing any references to depsBuild* in the runtime closure (I can’t think of a violating example immediately, but at least the manual claims that we currently don’t enforce this.)
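
Until something like that is enforced globally, an individual package can opt in with Nix's `disallowedReferences`. A sketch under assumptions: the package is hypothetical, `cmake` is just an example of a build-only tool, and spelling it `buildPackages.cmake` reflects my understanding that splicing only selects the build-platform variant inside the dependency lists.

```nix
# Hypothetical package; cmake is an example of a build-only dependency.
{
  stdenv,
  buildPackages,
  cmake,
}:

stdenv.mkDerivation {
  pname = "example";
  version = "0.1.0";
  src = ./.;

  strictDeps = true;
  nativeBuildInputs = [ cmake ];

  # Fail the build if the output retains a reference to the build-platform
  # cmake. buildPackages.cmake is spelled out on the assumption that a bare
  # `cmake` outside the dependency lists refers to the host-platform variant.
  disallowedReferences = [ buildPackages.cmake ];
}
```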

Atemu · July 27, 2024, 4:06pm

I can’t think of a scenario where depsHostHost would be used. You’d need some sort of binary that runs on the hostPlatform and produces code for the hostPlatform rather than the targetPlatform.

I guess in a cross compiler drv, the drvs for use in depsBuildBuild would be depsHostHost?

That’d be because I didn’t use what I intended to use. I had intended to use deps.hostTarget aka. buildInputs.

I don’t really see an alternative? You must make the distinction between these three platforms or else you couldn’t describe the build for a cross-compiler.

How do you think this would work?

Atemu · July 27, 2024, 4:17pm

I’m not deep into cmake but my understanding is that all it does is generate a Makefile and never generates code for the host platform.

Or does the ECM module make cmake generate code, necessitating a cmake that targets the hostPlatform?

emily · July 27, 2024, 4:39pm

The docs say “metaprogramming”. Which, sure, I guess? We actually have one single use, libuuid in edk2, which might just be an error.

It is a misfeature of GCC that the target is “special” and fixed at compile time. With Clang and Rust, it is standard for --target=‹whatever› to work out of the box at runtime with a single compiler build. The Plan 9 C toolchain and Go also do this in a much more reasonable way. A “cross compiler” shouldn’t be a particularly special thing, just a compiler that happens to be able to build for a given target, in the same way that any package we have might provide or not provide certain functionality. “Does this compiler support producing object code for AArch64?” is not fundamentally different from, like, “Does this tool support outputting in EBCDIC?”, but we don’t think we need a depsHostCharset.

The big problem with target is that it only applies to builds of compiler tools specifically, hence the suggestion to set targetPlatform = null rather than targetPlatform = hostPlatform for non‐compiler packages in the issue I linked, so that they’re not misleadingly interchangeable in dependencies. The sad aspect of that problem is that it only truly applies to builds of badly‐designed compiler tools.

Atemu · July 27, 2024, 4:50pm

It doesn’t matter whether it’s baked into the compiler binary or provided by a runtime flag (which would also have to be baked into e.g. a wrapper), the distinction still exists. At the end of the day, a build system must be able to run cc and have it spit out the correct binaries without the build system itself needing to be aware of the cross compilation setup. Building a “cross-compiler” (regardless of whether it’s a binary or a wrapper script) is the only way to achieve that in a generic manner.

If the charset was relevant, it’d be a property of the platform definition.

It’d still be depsHostTarget, the targetPlatform would simply have charset = "EBCDIC" while the hostPlatform may or may not.

waffle8946 · July 27, 2024, 5:14pm

Oh, I thought the existence of things like cmake --build implied it was generating code for runtime as well. If it’s just wrapping make, and we’re not using that wrapper, I don’t know why we’d put cmake in nativeBuildInputs. Does it need to care about its target platform? (What we’d call “host” in the package)?

AndersonTorres · July 27, 2024, 9:32pm

KDE docs:

The Extra CMake Modules package, or ECM, adds to the modules provided by CMake, including ones used by find_package() to find common software, ones that can be used directly in CMakeLists.txt files to perform common tasks and toolchain files that must be specified on the commandline by the user.

It looks like a package that enhances cmake with more detection macros.


SergeK · July 28, 2024, 9:29am

Well, Find*.cmake and *Config.cmake would be picked up by find_package both from buildInputs and nativeBuildInputs because of our blanket addSearchPath CMAKE_PREFIX_PATH ... rule, and they’d work the same because find_{program,library,path} inside these paths are controlled by environment variables irrespective of how we added the modules and configs to the prefix list. What matters is whether ECM ships (hard-codes paths to) any real libraries or executables, and how they are meant to be consumed:

❯ nix build nixpkgs#extra-cmake-modules 
❯ find result/ -type f -executable
result/share/ECM/kde-modules/kde-git-commit-hooks/clang-format.sh
result/share/ECM/kde-modules/kde-git-commit-hooks/pre-commit.in
❯ ag 'bin/' result/
...
result/share/ECM/kde-modules/kde-git-commit-hooks/clang-format.sh
1:#!/nix/store/agkxax48k35wdmkhmmija2i2sxg8i7ny-bash-5.2p26/bin/bash
...
❯ ag 'lib/' result/
...
❯ ag '/nix/store/' result/ 
result/nix-support/propagated-build-inputs 
1:/nix/store/ih3wsahlr3d787jc4kzqizp6syq6hy29-cmake-3.29.3 /nix/store/2hfzaqv42iwrpk3ya24cnjsklz6f68lw-pkg-config-wrapper-0.29.2
result/share/ECM/kde-modules/kde-git-commit-hooks/clang-format.sh
...
result/share/ECM/kde-modules/kde-git-commit-hooks/pre-commit.in
...

So, from my naive viewpoint, where to put ECM is a question of what platform you wish to execute the propagated cmake/pkg-config and the pre-commit scripts on.

SergeK · July 28, 2024, 9:33am

I guess the reason is precisely that its outputs are agnostic to the host/target distinction so we go with the less scary name?

AndersonTorres · July 29, 2024, 1:39pm

Let me try to understand this specific part.

Inside buildPlatform, we build (say) hugs, so that hugs will run on hostPlatform.

Inside hostPlatform, it will build (say) hcc, so that hcc will run on targetPlatform.

Is this the idea of offsets?

Atemu · July 29, 2024, 8:15pm

The idea behind target is what platform the built binary itself will produce code for. This is typically relevant when building a compiler or any other tool that produces platform-specific artifacts.

Inside of a build, we use e.g. gcc (itself built to run on the buildPlatform) to produce a hugs binary that can be executed by the hostPlatform.
The targetPlatform then decides what that binary produces code for when run. In the typical case, targetPlatform = hostPlatform and it produces code for the same platform it runs on, but you might want it to produce code for an entirely different platform (i.e. a cross-compiler).

This distinction allows you to build on e.g. an x86 build machine (buildPlatform) a compiler binary which runs on aarch64 (hostPlatform) and itself produces code for riscv64 (targetPlatform).

Relative offsets exist as a concept because you can shift your perspective. The example compiler of the previous paragraph could itself be used in another build (system = aarch64) in which case the build’s buildPlatform is the compiler’s hostPlatform and the build’s hostPlatform the compiler’s targetPlatform. If we were to build a code-generating tool again in this build, you’d have to specify the new targetPlatform too.
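
The shift of perspective can be observed in nixpkgs itself. A sketch, assuming `<nixpkgs>` is on `NIX_PATH` and using the `riscv64` cross set as an example: the packages in that set run on riscv64, while the `buildPackages` used to build them run on the local machine and, as far as I understand the offsets, target riscv64.

```nix
# Evaluate with: nix-instantiate --eval --strict offsets.nix
# (the file name is arbitrary; <nixpkgs> is assumed to be on NIX_PATH)
let
  pkgs = import <nixpkgs> { };
  cross = pkgs.pkgsCross.riscv64;
in
{
  # Perspective of a package in the cross set.
  packageBuild = cross.stdenv.buildPlatform.system;
  packageHost = cross.stdenv.hostPlatform.system;

  # Perspective of the tools that build it (one offset earlier):
  # their host should be the package's build platform,
  # their target should be the package's host platform.
  toolHost = cross.buildPackages.stdenv.hostPlatform.system;
  toolTarget = cross.buildPackages.stdenv.targetPlatform.system;
}
```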