Brainstorming for RFC: pname and version

Pastebin with a render:

Reading through it, one point comes to mind that needs clarification.

What happens when unstable branches off v1.2.0, but in the meantime v1.2.1 has been released and that amendment has become irrelevant on main?

Any conclusion about that case might need to be recorded in the detailed design. I don’t have an appropriate conclusion to propose.

Well, upgrade the expression to v1.2.1.

Could be. But is that always correct?

Maybe a better example to showcase the presumed issue would be if v2 is a re-implementation currently under way on master, but the latest released version is still v1.x.

I think what I’d want to read in a final RFC is some additional thought on how to resolve such cases.

You answered your own question: the latest released version is v1.x.

When v2 gets officially released some day in the future, it will be packaged. Until then, the version will have a format like 1.8+unstable=2021-05-20.
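
To make the shape of such a version string concrete, here is a minimal Python sketch that splits it into its parts. The regular expression and the name `split_version` are assumptions made for illustration only; the RFC draft does not define a parser.

```python
import re
from datetime import date

# Hypothetical pattern for the format discussed in this thread:
#   <base version starting with a digit>+<key>=<YYYY-MM-DD>
VERSION_RE = re.compile(
    r"^(?P<base>\d[^+]*)\+(?P<key>[a-z]+)=(?P<date>\d{4}-\d{2}-\d{2})$"
)

def split_version(version):
    """Split a version string into base, key and date (illustrative only)."""
    m = VERSION_RE.match(version)
    if m is None:
        # A plain tagged release such as "1.8" has no suffix.
        return {"base": version, "key": None, "date": None}
    return {
        "base": m["base"],
        "key": m["key"],
        "date": date.fromisoformat(m["date"]),
    }

parts = split_version("1.8+unstable=2021-05-20")
print(parts["base"], parts["key"], parts["date"])
```

Under these assumptions the base only records the last tagged release that the snapshot follows; it says nothing about which API the snapshot implements.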

Yeah, I think it has been discussed above, and iirc, there is no good solution available.

Imagine v2 has quite a different api, then having v1.x in the name will likely confuse people.

We’d have to somehow make clear that the version number has no semantic relevance other than to indicate the last stable release.

We have to somehow get it into people’s minds that 1.8+unstable=2021-05-20, counterintuitively, has nothing to do with version 1.8.

It might, but you are never allowed to conclude that from the 1.8 in the version string alone.

Just to be clear, I’m not at all against the proposal. On the contrary! :wink: And if this is going to be a trade-off decision, my stance would also be that we have to live with it (unless we agree to revise builtins.parseDrvName alongside).

What probably can be specified is what to do if we feel compelled to package a pre-release of PostgreSQL 11.13. It clearly needs to be 11.12+unstable= regardless of whatever happens in 13.x and 14.x branches.

Imagine v2 has quite a different api, then having v1.x in the name will likely confuse people.

Well, I would see it as a very dangerous thing to do as a developer.

The stable release uses a certain API. It was battle-tested, therefore more stable and secure.

The v2 api is completely experimental, and completely different from v1.

If someone is really interested in chasing this new v2 API even before its official release, then this person is already aware that the new non-released API is wildly different from v1. I do not see a reason to encode this API breakage in the package version we are using here.

It is the responsibility of the package user to know how to use it. Recognizing that the master branch uses a different API than the latest official release is not our concern as package managers.

If there is no such thing as a postgresql-11.13RC1, or whatever upstream names it, we will use postgresql-11.12+unstable=xx.xx.xx.

If there is no such thing as a postgresql-11.13RC1, or whatever upstream names it, we will use postgresql-11.12+unstable=xx.xx.xx.

True, but 11.12 would not be the latest upstream stable release, so the rules as written would nominally forbid the obviously correct answer. Maybe add «(on the corresponding stable branch, if applicable)»?

@7c6f434c’s example is probably the best case for the concern I had. Thanks! As a trade-off decision, I also agree with @AndersonTorres’s answer to my v2 “example” (since there is just no way we can know for sure it will be v2: upstream might change their mind, and we should record facts only, not predictions) :+1:

Maybe my choice of nomenclature was unfortunate.

In this document (lol), when I say “stable release”, I am not referring to how the upstream treats its own project. I am more interested in things like “How can I fetch this source code?”

As an illustrative example, in the good old obscure times Linux versions followed a three-numbered scheme X.Y.Z; in particular, odd values of Y indicated an experimental “unstable” release, while even values of Y were reserved for stable, production-ready releases.

For us, writers of Nixpkgs expressions, these particular semantics are not relevant. Or at least they have no relevance for the purposes of this RFC. We just create expressions for the useful versions, whether the upstream calls them stable or unstable. We regard all these versions as stable.

On the other hand, we can (if convenient for us) fetch the latest master branch of the Linux kernel. This is what I am calling “unstable” here.

Maybe “tagged release” would be a better name.

I think the problem is with the word “latest”, but also “tagged” is probably better than “stable”. :+1:

Instead of “latest”, we could say “latest tagged release (if a project has more than one release series: of the relevant release series)” and then clarify it in the examples section with the postgres example.
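
As a rough illustration of the proposed wording, the Python sketch below picks the base version from a list of tags, optionally restricted to one release series. The tag list, the helper names and the assumption that tags are purely numeric and dot-separated are all hypothetical.

```python
def parse_tag(tag):
    """Turn a purely numeric tag such as '11.12' into a comparable tuple (11, 12)."""
    return tuple(int(part) for part in tag.split("."))

def base_version(tags, series=None):
    """Return the latest tagged release, restricted to `series` if one is given.

    `series` is a tuple prefix such as (11,) for the 11.x release series.
    """
    candidates = [
        tag for tag in tags
        if series is None or parse_tag(tag)[:len(series)] == series
    ]
    if not candidates:
        raise ValueError("no tagged release in the requested series")
    return max(candidates, key=parse_tag)

# Hypothetical set of upstream tags with several release series.
tags = ["11.11", "11.12", "13.3", "14.0"]

print(base_version(tags, series=(11,)))  # latest tag of the 11.x series
print(base_version(tags))                # latest tag overall
```

With these made-up tags, a snapshot taken from the 11.x branch gets 11.12 as its base, regardless of what has been tagged in the 13.x and 14.x series.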

I now support this wording.

(I was not in doubt that the wording in the text was intended to achieve this result, but I think this rewording is much clearer)

Indeed, I can think of an extreme case: a very simple development environment with git as the only tool.

It is just a codebase maintained by many developers, and from time to time they make a new release.

But the “make a release” process is merely “Create a new entry in the changelog file, adding 1 to the version number and (optionally) registering the hash identifier and the date of this commit in YYYY-MM-DD format”.

There is no tarball being generated, only the git tree being updated and a changelog to be read by package maintainers. In this scenario, the versions can be fetched by reading the changelog, not by waiting for a (non-existent) automation process to generate a cute tarball.

Great stuff. Thanks for driving this.

However, it does not map very well with builtins.parseDrvName function

What do we do about that? It’s not evident from your proposal.

Usually, Nixpkgs maintains ā€œunstableā€ releases of many softwares, sometimes
along with stable ones.

You put “unstable” in quotes, yet continue using it in the proposed naming convention without clarifying its meaning. Stability is a property of the software (i.e. that it does not crash randomly) or of its API (i.e. that it does not change arbitrarily, subject to some rules). Having a tagged, named, or otherwise well-defined version identifier has nothing to do with that. Overloading the meaning of “unstable” is bound to produce confusion (at least initially), and I’d prefer to keep cognitive overhead low. What speaks against only adding +YYYY-MM-DD to the latest release version?

Also, what happened to +nixpkgs=YYYY-MM-DD to signify custom patches?

As an alternative I would like to add a suffix signifying the version control revision identifier, such as +git=<hash>.

The new format (hopefully) solves it automatically, because it forces version to start with a digit. But OK, it needs to be clearer.
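
For readers who have not looked at it, the Nix manual describes builtins.parseDrvName as taking the package name to be everything up to the first dash not followed by a letter, with the version being everything after that dash. The Python function below is only a simplified model of that rule, not the real implementation, and the package name `foo` is made up.

```python
import re

def parse_drv_name(s):
    """Simplified model of the split rule documented for builtins.parseDrvName."""
    # First dash that is followed by a character that is not a letter.
    m = re.search(r"-(?=[^A-Za-z])", s)
    if m is None:
        return {"name": s, "version": ""}
    return {"name": s[:m.start()], "version": s[m.end():]}

# A version that starts with a digit is split where one would expect.
print(parse_drv_name("foo-1.8+unstable=2021-05-20"))

# Assumed example of a version that starts with a letter: the word
# "unstable" ends up in the name and only the date is seen as the version.
print(parse_drv_name("foo-unstable-2021-05-20"))
```

In this model, a version that begins with a digit always lands entirely on the version side of the split, which is the property relied on above.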

Yes, I have noticed it while brainstorming.
Better names are needed.

The idea is to use key=value semantics in the version attribute, as suggested by @zimbatm above.
It makes the version attribute more amenable to parsing and therefore to automation.

However we need a better name here. Maybe untagged?

Unneeded.

Patching the code in order to make it run seamlessly in Nixpkgs is an expected part of our development. There is no need to encode it in the version.

The purpose of the version attribute is upgrading. Indeed, this is stated explicitly in the nix-env manual.
Custom patches made exclusively for Nixpkgs are not an “upgrade” to the code in this sense, no more than patchShebangs.

Indeed, conceptually patchShebangs, substituteInPlace et al. are all patches, as well as many other things we do with software like CMake and Meson. Patches are not limited to the diff & patch tools after all.
However, no one would suggest encoding all these custom patches in the version attribute.

Therefore, there is no need to encode these custom Nixpkgs patches in version.

Git hashes are opaque strings. They convey no useful information for human beings. Dates are way more useful.

Also, as said above, version is used to upgrade. Git hashes don’t follow the ordering rules stated by nix-env, whereas YYYY-MM-DD dates do.
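
A small Python sketch of the ordering argument follows. It uses plain string comparison rather than nix-env's actual comparison algorithm, and the dates and abbreviated hashes are invented for the example.

```python
from datetime import date

# Hypothetical snapshots: (commit date as YYYY-MM-DD, abbreviated git hash).
snapshots = [
    ("2021-05-20", "9f2c1ab"),
    ("2020-11-03", "e41d7c0"),
    ("2021-01-15", "03bb5fe"),
]

# Reference ordering: sort by the real calendar date.
chronological = sorted(snapshots, key=lambda s: date.fromisoformat(s[0]))

# Sorting the date strings as plain text gives the same order, because the
# most significant field (the year) comes first and every field is zero-padded.
by_date_string = sorted(snapshots, key=lambda s: s[0])

# Sorting by hash gives an order unrelated to when the commits were made.
by_hash = sorted(snapshots, key=lambda s: s[1])

print(by_date_string == chronological)
print(by_hash == chronological)
```

The point is only that a date suffix gives a newer snapshot a greater version, which a hash cannot guarantee.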

Ah right. I got confused because I followed the structure of the document, and there was no transition from “the parser separates on digit” to “this is how we stay compatible while making it more meaningful”; it went straight into the definition.

Agreed on keeping the key=value thing consistent. Don’t like untagged because it alludes to semantics that are specific to git. What about unreleased, to continue in the same vein as we had so far? Although in general I think it would be better to have a positive, i.e. unprefixed term, such as rolling (as in “rolling release”, although that’s already a bit obfuscated) or snapshot (as @davidak suggested).

Alright, thanks for the clarification.

I thought the argument about the debianesque prefix for +nixpkgs= was originally about picking a specific unreleased version for packaging purposes, not just patching for packagability. I’d accept the argument for making it a statement of “editor’s discretion” and therefore prefixing it as such, and this is what I read from @rycee’s comment; correct me if I’m wrong. Other than that it would not convey much meaning without further explanation.

Sure. I intended to clarify this as an alternative deliberately not considered, maybe I didn’t put it clearly.

Would be good to add exactly that to the reasoning in the document.

When I commented before, I felt like trying to compose multiple pieces of human-derived information into a single string is putting the cart before the horse. (I’m less focused on the information in the string or how it gets laid out than on whether improving the packaging process here can make it easier to iterate towards ecosystem goals.)

I don’t think cart/horse is a constructive observation, so I tried to tease out some alternate frames/approaches. But as the discussion has played out, I’ve started to wonder if I muddled or under-sold my point.

I want to take another swing, but please ignore me if my point was clear the first time. :slight_smile:


Previously, I focused on a process:

I probably should have given specific examples:

  • There should be 1 attr for the official upstream version identifier. There should be no opinion or deviation (even if that identifier is non-numeric). If there is no such identifier, the value should be none.

  • No vague attr names like date or version. The name should clarify where the date comes from and what the version belongs to.

  • Packagers shouldn’t need to make any decision/judgement/opinion call that requires systemic knowledge/perspective; this is a job for composing functions.

    • They shouldn’t have to decide if something is “unstable” or not.

    • They shouldn’t be deciding if something like the release/commit date is the version for Nix’s purposes. (Even if the upstream version IS the date, this isn’t the packager’s call.)

  • Every value or detail that is included in or impacts the composition of pname/name/version strings belongs in its own attr.

  • Needing a comment to explain the value’s format (like a date format) is a smell that it should be decomposed into an attrset.

  • With as few exceptions as possible, pname/name/version strings should be programmatically composed from these values.
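
One possible reading of the list above, sketched in Python rather than Nix: each detail lives in its own field and the strings are composed by functions. All field names, the `+unstable=` suffix taken from earlier in the thread, and the `"0"` placeholder for a missing upstream version are assumptions, not part of any proposal.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class SourceInfo:
    pname: str
    # Official upstream version identifier, or None if upstream has none.
    upstream_version: Optional[str]
    # Date of the fetched commit, or None when packaging a tagged release.
    commit_date: Optional[date] = None

def compose_version(info):
    """Compose the version string from the individual fields."""
    # "0" is a placeholder assumption for projects without any version identifier.
    base = info.upstream_version if info.upstream_version is not None else "0"
    if info.commit_date is None:
        return base
    return f"{base}+unstable={info.commit_date.isoformat()}"

def compose_name(info):
    """Compose the full name from pname and the composed version."""
    return f"{info.pname}-{compose_version(info)}"

snapshot = SourceInfo("foo", "1.8", date(2021, 5, 20))
print(compose_name(snapshot))
print(compose_version(SourceInfo("foo", "1.8")))
```

The packager only fills in facts here (identifier, date); whether and how they end up in the final strings is decided in one place by the composing functions.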

Update; it is now way more confused!

I am trying to introduce the idea of multiple branches, as well as some useful terminology.
