Brainstorming for RFC: pname and version

AndersonTorres · May 30, 2021, 4:45am

Pastebin with a render:

blaggacao · May 30, 2021, 6:12pm

Reading through it, one point comes to mind that needs clarification.

What happens when unstable branches off v1.2.0, but in the meantime v1.2.1 has been released and that amendment has become irrelevant on main?

Any conclusion about that case might need to be recorded in the detailed design. I don’t have an appropriate conclusion to propose.

AndersonTorres · May 30, 2021, 7:34pm

Well, upgrade the expression to v1.2.1

blaggacao · May 30, 2021, 9:36pm

Could be. But is that always correct?

Maybe a better example to showcase the presumed issue would be if v2 is a re-implementation currently under way on master, but the latest released version is still v1.x.

What I’d want to read in a final RFC is some additional guidance on how to resolve such cases.

AndersonTorres · May 30, 2021, 10:17pm

You answered your own question: the latest released version is v1.x.

When v2 gets officially released some day in the future, it will be packaged. Until then, the version will have a format like 1.8+unstable=2021-05-20.

blaggacao · May 30, 2021, 10:42pm

Yeah, I think it has been discussed above, and iirc, there is no good solution available.

Imagine v2 has quite a different api, then having v1.x in the name will likely confuse people.

We’d have to somehow make clear that the version number has no semantic relevance other than to indicate the last stable release.

We have to somehow get into people’s minds the fact that 1.8+unstable=2021-05-20, counterintuitively, has nothing to do with version 1.8.

It might, but you are never allowed to conclude that from the 1.8 in the version string (alone).

Just to be clear, I’m not at all against the proposal. On the contrary! And if this is going to be a trade-off decision, my stance would also be that we have to live with it (unless we agree to revise builtins.parseDrvName alongside).

7c6f434c · May 30, 2021, 10:54pm

What probably can be specified is what to do if we feel compelled to package a pre-release of PostgreSQL 11.13. It clearly needs to be 11.12+unstable= regardless of whatever happens in 13.x and 14.x branches.

AndersonTorres · May 31, 2021, 12:38am

blaggacao:

Imagine v2 has quite a different api, then having v1.x in the name will likely confuse people.

Well, I would see it as a very dangerous thing to do as a developer.

The stable release uses a certain API. It was battle-tested, therefore more stable and secure.

The v2 api is completely experimental, and completely different from v1.

If someone is really interested in chasing this new v2 API even before its official release, then this person is already aware that the new non-released API is wildly different from v1. I do not see a reason to encode this API breakage in the package version we are using here.

It is the responsibility of the package user to know how to use it. Recognizing that the master branch uses a different API than the latest official release is not our concern as package managers.

AndersonTorres · May 31, 2021, 12:40am

If there is nothing like a postgresql-11.13RC1, or whatever the upstream named it, we will use postgresql-11.12+unstable=xx.xx.xx.

7c6f434c · May 31, 2021, 12:56am

7c6f434c:

if we feel compelled to package a pre-release of PostgreSQL 11.13.

If there is nothing like a postgresql-11.13RC1, or whatever the upstream named it, we will use postgresql-11.12+unstable=xx.xx.xx.

True, but 11.12 would not be the latest upstream stable release, so the rules as written would nominally forbid the obviously correct answer. Maybe add «(on the corresponding stable branch, if applicable)»?

blaggacao · May 31, 2021, 1:28am

@7c6f434c's example is probably the best case for the concern I had. Thanks! As a trade-off decision, I also agree with @AndersonTorres's answer to my v2 “example” (since there is just no way we can know for sure it will be v2: upstream might change their mind, and we should record facts only, not predictions).

AndersonTorres · May 31, 2021, 2:07am

Maybe my choice of nomenclature was unfortunate.

In this document (lol), when I say “stable release”, I am not referring to how the upstream treats its own project. I am more interested in things like “How can I fetch this source code?”

As an illustrative example, in the good old obscure times Linux versions followed a three-number scheme X.Y.Z; in particular, odd values of Y indicated an experimental “unstable” release, while even values of Y were reserved for stable, production-ready releases.

For us, Nixpkgs expression writers, these particular semantics are not relevant. Or at least for the purposes of this RFC they have no relevance. We just create expressions for the useful versions, whether the upstream calls them stable or unstable. We regard all these versions as stable.

On the other hand, we can (if convenient for us) fetch the latest master branch of Linux kernel. This is what I am calling “unstable” here.

Maybe tagged release would be a better name.

blaggacao · May 31, 2021, 3:21am

I think the problem is with the word “latest”, but also “tagged” is probably better than “stable”.

Instead of “latest”, we could say “latest tagged release; if a project has more than one release series, the latest tagged release of the relevant release series” and then clarify it in the examples with the postgres example.
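
To make the proposed wording concrete, here is a minimal Python sketch of picking the base version from the relevant release series and appending the suffix format discussed above. The function names and the tag list are hypothetical and only for illustration; nothing like this exists in Nixpkgs, and it assumes purely numeric dotted tags.

```python
from datetime import date

def parse_release(tag):
    """Turn a purely numeric dotted tag such as '11.12' into (11, 12)."""
    return tuple(int(part) for part in tag.split("."))

def latest_in_series(tags, series):
    """Return the newest tag whose leading components equal `series`."""
    prefix = parse_release(series)
    candidates = [t for t in tags if parse_release(t)[:len(prefix)] == prefix]
    if not candidates:
        raise ValueError(f"no tagged release in series {series}")
    return max(candidates, key=parse_release)

def unstable_version(tags, series, snapshot):
    """Compose '<latest tag of the series>+unstable=<YYYY-MM-DD>'."""
    base = latest_in_series(tags, series)
    return f"{base}+unstable={snapshot.isoformat()}"

# Illustrative tag list with several release series (assumption).
tags = ["11.11", "11.12", "12.7", "13.3"]
print(unstable_version(tags, "11", date(2021, 5, 30)))
```

With this reading, a snapshot of the 11.x branch is based on 11.12 even though 13.3 is the newest tag overall, which is the postgres case raised above.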

7c6f434c · May 31, 2021, 6:42am

I now support this wording.

(I was not in doubt that the wording in the text was intended to achieve this result, but I think this rewording is much clearer)

AndersonTorres · May 31, 2021, 11:52am

Indeed, I can think of an extreme case: a very simple developer environment with git as the only tool.

It is just a codebase maintained by many developers, and from time to time they make a new release.

But the “make a release” process is merely “Create a new entry in the changelog file, adding 1 to the version number and (optionally) registering the hash identifier and the date of this commit in YYYY-MM-DD format”.

There is no tarball being generated, only the git tree being updated and a changelog to be read by package maintainers. In this scenario, the versions can be fetched by reading the changelog, not by waiting for a (non-existent) automated process to generate a cute tarball.

fricklerhandwerk · May 31, 2021, 2:01pm

Great stuff. Thanks for driving this.

However, it does not map very well with builtins.parseDrvName function

What do we do about that? It’s not evident from your proposal.

Usually, Nixpkgs maintains “unstable” releases of many softwares, sometimes
along with stable ones.

You put “unstable” in quotes, yet continue using it in the proposed naming convention without clarifying its meaning. Stability is a property of the software, i.e. that it does not crash randomly, or its API, i.e. that it does not change arbitrarily subject to some rules. Having a tagged, named, or otherwise well-defined version identifier has nothing to do with that. Overloading the meaning of “unstable” is bound to produce confusion (at least initially), and I’d prefer to keep cognitive overhead low. What speaks against only adding +YYYY-MM-DD to the latest release version?

Also, what happened to +nixpkgs=YYYY-MM-DD to signify custom patches?

As an alternative I would like to add a suffix signifying the version control revision identifier, such as +git=<hash>.

AndersonTorres · May 31, 2021, 4:50pm

The new format (hopefully) solves it automatically, because it forces version to start with a digit. But OK, it needs to be clearer.
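
As a rough illustration of why a leading digit matters, here is a minimal Python sketch of the splitting rule as I understand it from the Nix manual entry for builtins.parseDrvName: the name ends at the first dash that is not followed by a letter. It is an approximation written for this discussion, not the real implementation, and the package name foo is hypothetical.

```python
def parse_drv_name(full_name):
    """Split 'name-version' at the first dash not followed by a letter."""
    for i, ch in enumerate(full_name):
        next_is_letter = i + 1 < len(full_name) and full_name[i + 1].isalpha()
        if ch == "-" and i + 1 < len(full_name) and not next_is_letter:
            return {"name": full_name[:i], "version": full_name[i + 1:]}
    # No suitable dash found: everything is the name.
    return {"name": full_name, "version": ""}

# Version starts with a letter: the word "unstable" ends up in the name.
print(parse_drv_name("foo-unstable-2021-05-20"))
# Version starts with a digit: the split lands where we want it.
print(parse_drv_name("foo-1.8+unstable=2021-05-20"))
```

Under this rule the first call yields the name foo-unstable, while the second keeps the name foo and the whole 1.8+unstable=2021-05-20 string as the version.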

Yes, I have noticed it while brainstorming.
Better names are needed.

The idea is to use a key=value semantics on the version attribute, as suggested by @zimbatm above.
It makes the version attribute more amenable to parsing and therefore to automation.
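
To show what "amenable to parsing" could look like, here is a small Python sketch that splits such a version string into its base and its key=value fields. The function name is hypothetical, and allowing several +key=value suffixes is my own assumption, not something the draft specifies.

```python
def split_version(version):
    """Split '1.8+unstable=2021-05-20' into its base and key=value fields."""
    base, *suffixes = version.split("+")
    fields = {}
    for suffix in suffixes:
        key, sep, value = suffix.partition("=")
        if not sep:
            raise ValueError(f"suffix {suffix!r} is not of the form key=value")
        fields[key] = value
    return base, fields

print(split_version("1.8+unstable=2021-05-20"))
print(split_version("11.12"))
```

A tool could then read the base version and the snapshot date separately instead of guessing at the layout of the string.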

However we need a better name here. Maybe untagged?

Unneeded.

Patching the code in order to make it run seamlessly in Nixpkgs is an expected piece of our development. There is no need to encode it in the version.

The purpose of the version attribute is upgrading. Indeed, this is stated explicitly in the nix-env manual.
Custom patches made exclusively for Nixpkgs are not an “upgrade” to the code in this sense, no more than patchShebangs.

Indeed, conceptually patchShebangs, substituteInPlace et al. are all patches, as well as many other things we do in software like CMake and Meson. Patches are not limited to the diff & patch tools after all.
However, no one would suggest to encode all these custom patches in version attribute.

Therefore, there is no need to encode these custom Nixpkgs patches in version.

Git hashes are opaque strings. They convey no useful information for human beings. Dates are way more useful.

Also, as said above, version is used to upgrade. Git hashes don’t follow the ordering rules stated by nix-env, whereas YYYY-MM-DD dates do.
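
A quick Python sketch of the ordering point. It uses plain string comparison as a stand-in and does not reproduce the actual nix-env comparison algorithm; the commit hashes are made up.

```python
from datetime import date

# Hypothetical snapshots, listed oldest first: (commit date, abbreviated hash).
snapshots = [
    (date(2021, 5, 20), "9f2c1ab"),
    (date(2021, 5, 30), "03e7d55"),
    (date(2021, 6, 2), "c41b0e8"),
]

# Sorting by the YYYY-MM-DD string preserves chronological order.
by_date_string = sorted(snapshots, key=lambda s: s[0].isoformat())
# Sorting by hash gives an order unrelated to time.
by_hash = sorted(snapshots, key=lambda s: s[1])

print(by_date_string == snapshots)
print(by_hash == snapshots)
```

Zero-padded ISO dates compare the same way as the dates they stand for, whereas a hash says nothing about which commit is newer.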

fricklerhandwerk · May 31, 2021, 7:34pm

Ah right. I got confused, because I followed the structure of the document, and there was no transition from „the parser separates on digit“ to „this is how we stay compatible while making it more meaningful“, but went straight into the definition.

Agreed on keeping the key=value thing consistent. Don’t like untagged because it alludes to semantics that are specific to git. What about unreleased, to continue in the same vein as we had so far? Although in general I think it would be better to have a positive, i.e. unprefixed term, such as rolling (as in „rolling release“, although that’s already a bit obfuscated) or snapshot (as @davidak suggested).

Alright, thanks for the clarification.

I thought the argument about the debianesque prefix for +nixpkgs= was originally about picking a specific unreleased version for packaging purposes, not just patching for packagability. I’d accept the argument for making it a statement of „editor‘s discretion“ and prefixing it as such, and this is what I read from @rycee‘s comment; correct me if I’m wrong. Other than that it would not convey much meaning without further explanation.

Sure. I intended to clarify this as an alternative deliberately not considered, maybe I didn’t put it clearly.

Would be good to add exactly that to the reasoning in the document.

abathur · May 31, 2021, 11:37pm

When I commented before, I felt like trying to compose multiple pieces of human-derived information into a single string is putting the cart before the horse. (I’m less focused on the information in the string or how it gets laid out than on whether improving the packaging process here can make it easier to iterate towards ecosystem goals.)

I don’t think cart/horse is a constructive observation, so I tried to tease out some alternate frames/approaches. But as the discussion has played out, I’ve started to wonder if I muddled or under-sold my point.

I want to take another swing, but please ignore me if my point was clear the first time.

Previously, I focused on a process:

I probably should have given specific examples:

- There should be 1 attr for the official upstream version identifier. There should be no opinion or deviation (even if that identifier is non-numeric). If there is no such identifier, the value should be none.
- No vague attr names like date or version. The name should clarify where the date comes from and what the version belongs to.
- Packagers shouldn’t need to make any decision/judgement/opinion call that requires systemic knowledge/perspective; this is a job for composing functions.
  - They shouldn’t have to decide if something is “unstable” or not.
  - They shouldn’t be deciding if something like the release/commit date is the version for Nix’s purposes. (Even if the upstream version IS the date, this isn’t the packager’s call.)
- Every value or detail that is included in or impacts the composition of pname/name/version strings belongs in its own attr.
- Needing a comment to explain the value’s format, like a date format, is a smell that it should be decomposed into an attrset.
- With as few exceptions as possible, pname/name/version strings should be programmatically composed from these values.
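
To sketch what I mean by composing functions, here is a minimal Python illustration. All names are hypothetical, and the +unstable= layout is simply borrowed from the format discussed in this thread; the point is only that the packager records raw facts and a shared function applies the policy.

```python
from datetime import date

def compose_version(upstream_version, commit_date=None):
    """Build the version string from separately recorded facts.

    upstream_version: the official upstream identifier, verbatim, or None.
    commit_date: date of the fetched commit, or None for a tagged release.
    """
    if upstream_version is None:
        # What to do without any upstream identifier is left open here.
        raise ValueError("no policy defined for sources without an upstream version")
    if commit_date is None:
        return upstream_version
    return f"{upstream_version}+unstable={commit_date.isoformat()}"

def compose_name(project_name, upstream_version, commit_date=None):
    """Join the project name and the composed version with a dash."""
    return f"{project_name}-{compose_version(upstream_version, commit_date)}"

print(compose_name("foo", "1.8"))
print(compose_name("foo", "1.8", date(2021, 5, 20)))
```

If the convention changes later, only the composing function has to change, not every package expression.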

AndersonTorres · May 31, 2021, 11:59pm

Update; it is now way more confused!

I am trying to introduce the idea of multiple branches, as well as some useful terminology.