Meta attribute for binary packages

Exactly, this could be done today if someone wants to do a big PR to update all the relevant meta attributes.
It would also become a metric that we could track over time.

Are Java JAR files “binaries”? For example, Tomcat is open source, but building it from source is a pain in the ass (maybe not, I don’t know).

It may be decompiled, because it isn’t obfuscated. Which license should this be?

IMO a JAR file is not a plain-text file, and thus should be treated as binary :slight_smile: That said, I think most people wouldn’t be bothered by having pre-built binaries on their systems, so until someone finds the motivation to change the tomcat derivation to build from source it could very well stay a binary derivation.

Which makes me think: how do we want to actually handle dependencies that are source-only but not checked?

To explain: buildRustPackage downloads all the dependencies in a single fixed-output derivation. Because it downloads all dependencies at once, the hash is specific to this particular application. As a consequence, even though all the code is built from source, it is very unlikely that someone will review this exact set of dependencies and give a review result for the cargoSha256 hash.
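To make that concrete, here is a rough skeleton of such a package. The package name, owner and repo are made up, and both hashes are placeholders, so this evaluates but will not build until real hashes are filled in. It assumes a nixpkgs version that still accepts cargoSha256 (newer versions spell it cargoHash).

```nix
# Hypothetical package; meant to be loaded with callPackage.
{ lib, rustPlatform, fetchFromGitHub }:

rustPlatform.buildRustPackage rec {
  pname = "example-tool"; # made-up name
  version = "0.1.0";

  # Upstream source: this hash covers a tarball that upstream itself
  # distributes, so it can be compared against what others fetch.
  src = fetchFromGitHub {
    owner = "example"; # made-up owner/repo
    repo = "example-tool";
    rev = "v${version}";
    sha256 = lib.fakeSha256; # placeholder
  };

  # One hash for the whole vendored set of crates listed in Cargo.lock.
  # It is unique to this application, so nobody else will have reviewed
  # exactly this blob.
  cargoSha256 = lib.fakeSha256; # placeholder
}
```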

Do you think that is an issue we can safely ignore, or would it be better to set meta.binary for the cargoSha256 and similar fixed-output derivations, that are not directly tarballs/repositories distributed from upstream?

I like that; usually it’s better to “just do it” and improve later than to never start anything because it may not be perfect :slight_smile:
That said, I don’t know of many binary packages and won’t actively search them out for now. Maybe we could mark them gradually as we notice binary packages.

As for the name, I think isBinary would be fine. But I’m not married to that.

Yes, I’d say they are. Also decompilation loses a lot of metadata.

I think the license / binary questions are unrelated.

Hm, I don’t know much about the Rust infrastructure. Ideally, we wouldn’t do that but instead specify the dependencies like with any other nix package. There are probably good reasons why we do it like that though.

Since all the source is still available, I wouldn’t classify it as binary. But it’s not pretty either :smiley:


Fun fact: I proposed exactly such a change in the early version of [WIP] stdenv: inherit `src.meta` to allow moving most of `meta` into `src.meta` by oxij · Pull Request #35075 · NixOS/nixpkgs · GitHub

78b4941  stdenv/check-meta, doc: add and document `binary` meta attr
6371134  stdenv/check-meta, doc: add and document `allowBinary`

but that was deemed controversial (the PR in question is very related, btw, since you generally want to tag src as binary, and have the derivation inherit it automagically unless explicitly overridden; I haven’t reached that utopia yet, though).

With the experience since then, I’d say the following works fine:

  • add binary meta attr,
  • mark a bunch of binary packages as such (just search for calls to patchelf),
  • add allowBinary, set it to true by default,
  • add permittedBinaryPackages,
  • in paranoid configuration.nixes set allowBinary to false, add bootstrapTools to permittedBinaryPackages, get happy.
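
A minimal sketch of how the check from the list above could behave, written as plain Nix so it can be tried with `nix-instantiate --eval --strict`. Note that `meta.binary`, `allowBinary` and `permittedBinaryPackages` are the names proposed here, not existing nixpkgs options, and the packages are mock attribute sets rather than real derivations.

```nix
let
  # Hypothetical "paranoid" configuration, following the proposal.
  config = {
    allowBinary = false;
    permittedBinaryPackages = [ "bootstrap-tools" ];
  };

  # A package counts as binary only if it is explicitly marked as such.
  isBinary = pkg: pkg.meta.binary or false;

  # Permitted if it is not binary, or binaries are allowed globally,
  # or this particular package is on the exception list.
  isPermitted = pkg:
    !(isBinary pkg)
    || config.allowBinary
    || builtins.elem pkg.name config.permittedBinaryPackages;

  # Mock packages, for illustration only.
  fromSource = { name = "libfoo-1.0"; meta = { }; };
  bootstrapTools = { name = "bootstrap-tools"; meta.binary = true; };
  someBlob = { name = "some-blob-1.0"; meta.binary = true; };
in
  map isPermitted [ fromSource bootstrapTools someBlob ]
  # evaluates to [ true true false ]
```

Matching on the full name keeps the sketch short; a real implementation would probably want to match the way permittedInsecurePackages does.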

Thanks, that looks very relevant. I agree that adding it in the same PR may have been out of scope. Do you plan to move that PR forward? If so, it would probably be best to wait for that to avoid adding a bunch of meta attributes just to later move them to the src.

If there’s bytecode for a virtual machine, I would certainly consider that binary.

What is the original intent then? If it’s “be sure source code is not hidden” then unobfuscated JAR files do provide that.

JAR decompilers (same for C#) produce nice code, and can wrap it back to “binary”.

So, I’d treat (unobfuscated) JAR files as archives of archives, and if we unpack an archive of archives and then pack it back, would it count as building from source? IMO yes, but then why do that unpack/pack?

The situation for many other languages is different: it is often impossible to decompile. That would count as “binary”. Also, if the license explicitly forbids decompilation, that would also be “binary”, even if it is a JAR. And when the code is obfuscated, this also means “binary”.

If the original intent is “purification”, then I agree JARs may be considered “impure”, “binary”, “not built by our machines”.

Hmm, discussing the precise border probably isn’t the main point of this thread. In some respects it’s apparently even harder to define than I anticipated.

I’m surprised that Java’s decompiled code is considered sufficiently readable, but I have basically no knowledge about that. I often wouldn’t be able to easily understand my own C code without the extensive comments I put there.

Do you plan to move that PR forward?

I do, but that issue is hard (it needs a large treewide change to nixpkgs to resolve, or a clever hack I didn’t invent yet) and of fairly low impact at the same time, so that’s a low priority. So, realistically, maybe in two years.

If so, it would probably be best to wait for that to avoid adding a bunch of meta attributes just to later move them to the src.

Well, it’s not like it would be hard to move those attributes from the derivation to the src sometime later.

IMO the biggest stdenv problem ATM is with check-meta.nix, that one needs to be generalized and simplified ASAP, which is a high priority. I expect binary patches to be reborn after that one is finished.

I would prefer that such a feature not slow down nix evaluation, since I don’t see it as super useful: you can still hide malware in large dependency graphs (see npm).

Aside from specifically a meta.binary attribute, I would like to see a generic way in Nixpkgs to pass in a function that computes whether a derivation is permitted or not. Certain organizations need to keep track of the software they use, and check various properties. Such a generic way could be used to insert this type of tracking and validating.
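
As a rough illustration of what such a user-supplied function might look like: the sketch below is pure Nix with mock packages, and `orgPolicy` is a hypothetical name, not an existing nixpkgs option. It also assumes each package carries a single license, whereas in nixpkgs `meta.license` can be a list.

```nix
let
  # Hypothetical organisation policy: any function from a package to a bool.
  # Here: the license must be on an allowlist and the package must not be
  # marked binary.
  orgPolicy = pkg:
    builtins.elem (pkg.meta.license.shortName or "unknown") [ "mit" "asl20" ]
    && !(pkg.meta.binary or false);

  # Mock packages, for illustration only.
  pkgs = [
    { name = "libfoo-1.0"; meta.license.shortName = "mit"; }
    { name = "server-bin-9.0"; meta = { license.shortName = "asl20"; binary = true; }; }
    { name = "some-blob-1.0"; meta.license.shortName = "unfree"; }
  ];
in
  # Names of the packages the policy lets through.
  map (p: p.name) (builtins.filter orgPolicy pkgs)
  # evaluates to [ "libfoo-1.0" ]
```

The point is only that the policy is an ordinary function, so tracking or validating extra properties means changing the function rather than adding a new option each time.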


Hmm, perhaps. It’s true that so far we’ve only been approaching it over the particular aspects (license, insecure, …); see the Nixpkgs 23.11 manual. Adding such a generic hook should be relatively easy and cheap; using it is probably harder for the simple use cases, so I expect it wouldn’t be replacing (all) the related options we have already.

Maybe the allowUnfree etc. could be helpers for easily setting the generic hook? This would solve all problems at once, and potentially even simplify code at the end of the day. Something like having the generic hook default to (pseudo-code) pkg: (isFree pkg || allowUnfree pkg) && (isNotBroken pkg || allowBroken pkg)
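
Spelled out as evaluable Nix, that pseudo-code could look roughly like this. Everything here is a sketch: `isPermitted` is a hypothetical name for the generic hook, the packages are mock attribute sets, and the real check-meta code does considerably more.

```nix
let
  # Mock user config; setting a hypothetical `isPermitted` here would
  # replace the default hook entirely.
  config = { allowUnfree = false; allowBroken = false; };

  isFree = pkg: pkg.meta.license.free or true;
  isNotBroken = pkg: !(pkg.meta.broken or false);

  # Helpers turning the familiar flags into predicates.
  allowUnfree = _pkg: config.allowUnfree;
  allowBroken = _pkg: config.allowBroken;

  # The default hook, built from the helpers as in the pseudo-code.
  defaultHook = pkg:
    (isFree pkg || allowUnfree pkg) && (isNotBroken pkg || allowBroken pkg);

  # Use the user's function if one is given, else the default.
  hook = config.isPermitted or defaultHook;

  # Mock packages, for illustration only.
  free = { name = "libfoo-1.0"; meta = { }; };
  unfree = { name = "some-blob-1.0"; meta.license.free = false; };
  broken = { name = "libbar-0.3"; meta.broken = true; };
in
  map hook [ free unfree broken ]
  # evaluates to [ true false false ]
```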

That’s basically the current code, only the default isn’t that simple at all, because there are quite a lot of conditions already, we want to print nice descriptions, and there are also some checks that serve a slightly different purpose (e.g. type checks), etc.

Hmm… would just making this function overridable (well, the part that checks for unfree-ness, insecure-ness, broken-ness and not-blacklist-ness, at least, maybe not the type-checking stuff) somehow solve the problem? It’d mean that when it’s overridden the previous helpers are no longer available, but that doesn’t really sound like a big deal, as it’d be for rarely-used cases anyway.