Exactly, this could be done today if someone wants to do a big PR to update all the relevant meta attributes.
It would also become a metric that we could track over time.
Are Java JAR files “binaries”? For example, Tomcat is open source, but building it from source is a pain in the ass (maybe not, I don’t know).
A JAR may be decompiled, because it isn’t obfuscated. Which license should this be?
IMO a JAR file is not a plain-text file, and thus should be treated as binary. That said, I think most people wouldn’t be bothered by having pre-built binaries on their systems, so until someone finds the motivation to change the tomcat derivation to build from source, it could very well stay a binary derivation.
Which makes me think: how do we want to actually handle dependencies that are source-only but not checked?
To explain: `buildRustPackage` downloads all the dependencies in a fixed-output derivation. However, since it downloads all dependencies at once, the hash is specific to this particular application. As a consequence, even though all the code is built from source, it is very unlikely that someone will review this exact set of dependencies and give a review result for the `cargoSha256` hash.
Do you think that is an issue we can safely ignore, or would it be better to set `meta.binary` for the `cargoSha256` and similar fixed-output derivations that are not directly tarballs/repositories distributed from upstream?
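For readers less familiar with the Rust tooling, here is a rough sketch of the kind of expression in question. The package name, owner, repo and both hashes are placeholders invented for illustration; only `buildRustPackage`, `fetchFromGitHub` and `cargoSha256` are taken from nixpkgs as discussed above.

```nix
# Illustrative only: "example-tool", its owner/repo and both hashes are placeholders.
{ lib, rustPlatform, fetchFromGitHub }:

rustPlatform.buildRustPackage rec {
  pname = "example-tool";
  version = "0.1.0";

  # Upstream source: a repository snapshot that upstream distributes
  # and that a reviewer can inspect directly.
  src = fetchFromGitHub {
    owner = "example";
    repo = "example-tool";
    rev = "v${version}";
    sha256 = lib.fakeSha256; # placeholder hash
  };

  # Hash of the whole downloaded dependency set for this one package.
  # It covers every crate at once, so nobody else shares or reviews this
  # exact hash, which is the concern raised above.
  cargoSha256 = lib.fakeSha256; # placeholder hash
}
```

The point of the sketch is only that `src` and `cargoSha256` are two separate fixed-output fetches, and the second one has no upstream artifact to compare against.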
I like that, usually it’s better to “just do it” and improve later than to never start anything because it may not be perfect.
That said, I don’t know of many binary packages and won’t actively search them out for now. Maybe we could add them gradually as we notice binary packages.
As for the name, I think `isBinary` would be fine. But I’m not married to that.
Yes, I’d say they are. Also decompilation loses a lot of metadata.
I think the license / binary questions are unrelated.
Hm, I don’t know much about the Rust infrastructure. Ideally, we wouldn’t do that but instead specify the dependencies like with any other nix package. There are probably good reasons why we do it like that, though.
Since all the source is still available, I wouldn’t classify it as binary. But it’s not pretty either.
Fun fact: I proposed exactly such a change in an early version of NixOS/nixpkgs pull request #35075, “[WIP] stdenv: inherit `src.meta` to allow moving most of `meta` into `src.meta`” (by oxij):
78b4941 stdenv/check-meta, doc: add and document `binary` meta attr
6371134 stdenv/check-meta, doc: add and document `allowBinary`
but that was deemed controversial (the PR in question is closely related, btw, since you generally want to tag `src` as `binary` and have the derivation inherit it automagically unless explicitly overridden; I haven’t reached that utopia yet, though).
With the experience since then, I’d say the following works fine:
- add a `binary` meta attr,
- mark a bunch of binary packages as such (just search for calls to `patchelf`),
- add `allowBinary`, set it to `true` by default,
- add `permittedBinaryPackages`,
- in paranoid `configuration.nix`es, set `allowBinary` to `false`, add bootstrapTools to `permittedBinaryPackages`, get happy.
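As a rough sketch of what the last step could look like for a user: `allowBinary` and `permittedBinaryPackages` are the options proposed in this thread and do not exist in nixpkgs, and both the by-name list format (modelled on the existing `permittedInsecurePackages`) and the `"bootstrap-tools"` entry are assumptions.

```nix
# Paranoid configuration.nix fragment.
# HYPOTHETICAL: `allowBinary` and `permittedBinaryPackages` are only proposed
# in this thread; nixpkgs does not implement them.
{
  nixpkgs.config = {
    # Refuse to evaluate packages whose meta marks them as binary.
    allowBinary = false;

    # Exceptions, listed by derivation name (assumed format and name).
    permittedBinaryPackages = [ "bootstrap-tools" ];
  };
}
```

On the package side, the corresponding marker would just be `meta.binary = true;` in the derivation, again assuming the attribute name from the list above.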
Thanks, that looks very relevant. I agree that adding it in the same PR may have been out of scope. Do you plan to move that PR forward? If so, it would probably be best to wait for that to avoid adding a bunch of meta attributes just to later move them to the `src`.
If there’s bytecode for a virtual machine, I would certainly consider that binary.
What is the original intent, then? If it’s “be sure source code is not hidden”, then unobfuscated JAR files do provide that.
JAR decompilers (same for C#) produce nice code, and can wrap it back to “binary”.
So, I’d treat (unobfuscated) JAR files as archives of archives, and if we unpack an archive of archives and then pack it back, would it count as building from source? IMO yes, but then why do that unpack/pack?
The situation for many other languages is different: decompilation is often impossible. That would count as “binary”. Also, if the license explicitly forbids decompilation, that would also be “binary”, even if it is a JAR. Obfuscated code also counts as “binary”.
If the original intent is “purification”, then I agree JARs may be considered “impure”, “binary”, “not built by our machines”.
Hmm, discussing the precise border probably isn’t the main point of this thread. In some respects it’s apparently even harder to define than I anticipated.
I’m surprised that Java’s decompiled code is considered sufficiently readable, but I have basically no knowledge about that. I often wouldn’t be able to easily understand my own C code without the extensive comments I put there.
Do you plan to move that PR forward?
I do, but that issue is hard (it needs a large treewide change to nixpkgs to resolve, or a clever hack I haven’t invented yet) and of fairly low impact at the same time, so it’s a low priority. So, realistically, maybe in two years.
If so, it would probably be best to wait for that to avoid adding a bunch of meta attributes just to later move them to the `src`.
Well, it’s not like it would be hard to move those attributes from the derivation to the `src` sometime later.
IMO the biggest `stdenv` problem ATM is with `check-meta.nix`: that one needs to be generalized and simplified ASAP, which is a high priority. I expect the `binary` patches to be reborn after that one is finished.
I would prefer that such a feature not slow down Nix evaluation, as I don’t see it as super useful: you can still hide malware in large dependency graphs (see npm).
Aside from specifically a `meta.binary` attribute, I would like to see a generic way in Nixpkgs to pass in a function that computes whether a derivation is permitted or not. Certain organizations need to keep track of the software they use, and check various properties. Such a generic way could be used to insert this type of tracking and validating.
Hmm, perhaps. It’s true that so far we’ve only been approaching it through the particular aspects (license, insecure, …); see the Nixpkgs 23.11 manual. Adding such a generic hook should be relatively easy and cheap; using it is probably harder for the simple use cases, so I expect it wouldn’t be replacing (all) the related options we have already.
Maybe `allowUnfree` etc. could be helpers for easily setting the generic hook? This would solve all problems at once, and potentially even simplify the code at the end of the day. Something like having the generic hook default to (pseudo-code) `pkg: (isFree pkg || allowUnfree pkg) && (isNotBroken pkg || allowBroken pkg)`.
That’s basically the current code, only the default isn’t that simple at all, because there are quite a lot of conditions already, we want to print nice descriptions, and there are also some checks that serve a slightly different purpose (e.g. type checks), etc.
Hmm… would just making this function overridable (well, the part that checks for unfree-ness, insecure-ness, broken-ness and not-blacklist-ness, at least, maybe not the type-checking stuff) somehow solve the problem? It’d mean that when it’s overridden the previous helpers are no longer available, but that doesn’t really sound like a big deal, as it’d be for rarely-used cases anyway.
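To make the overridable-predicate idea a bit more concrete, here is a toy sketch that can be evaluated on its own with `nix-instantiate --eval --strict`. It is not how `check-meta.nix` works: `mkDefaultPredicate`, `isPermitted`, `allowBinary` and `meta.binary` are hypothetical names, and the real checks have more conditions, messages and type checks, as noted above.

```nix
# Toy model only; all names below except allowUnfree/allowBroken are hypothetical.
let
  # Default check assembled from the familiar allow* flags.
  mkDefaultPredicate = config: pkg:
    let
      meta = pkg.meta or { };
    in
    ((meta.license.free or true) || (config.allowUnfree or false))
    && (!(meta.broken or false) || (config.allowBroken or false))
    && (!(meta.binary or false) || (config.allowBinary or true));

  # Stand-in for the user's nixpkgs config.
  config = {
    allowUnfree = false;
    allowBinary = false;
    # An organization could set `isPermitted = pkg: ...;` here to replace
    # the default wholesale, losing the helpers above.
  };

  # The generic hook: the user-supplied function if present, else the default.
  isPermitted = config.isPermitted or (mkDefaultPredicate config);

  # Two fake packages, reduced to the meta fields the predicate reads.
  examples = {
    fromSource = { meta = { license.free = true; }; };
    prebuilt = { meta = { license.free = true; binary = true; }; };
  };
in
builtins.mapAttrs (_: isPermitted) examples
# evaluates to { fromSource = true; prebuilt = false; }
```

In this shape the existing flags are just inputs to the default function, which matches the suggestion that overriding the hook drops the helpers and is meant for rarely-used cases.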