Pre-RFC: Visually differentiate the `?` symbol in Nixpkgs

Hi,

I’d like to propose a solution to one of the “papercut” problems we have with the ? operator.
By formatting it differently, we can nudge readers in the right direction.
I think we have good reasons to make this change, so I would expect it to be an uncontroversial and fairly quick RFC.

What do you think?

UPDATE:
I’ve overlooked the fact that ? does not have .'s precedence, which I consider to be a fatal flaw for the proposal as written.
We might look at introducing a new operator to fix this.
This is something to keep in mind when reading the proposal.


Summary

Visually differentiate the “has attribute” symbol from the “parameter default” symbol in Nixpkgs.

Motivation

The Nix language syntax has a ? symbol, which is used in two places in the grammar.

  • The “has attribute” operator, and
  • The symbol that precedes a default value for an attribute set destructuring binding.
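The two uses can be seen side by side in a small sketch. The names hasPassthru and getPassthru are made up for illustration and do not come from Nixpkgs.

```nix
let
  # "Has attribute" operator: tests whether the set has an attribute named passthru.
  hasPassthru = drv: drv ? passthru;

  # Parameter default: passthru falls back to { } when the argument omits it.
  getPassthru = { passthru ? { }, ... }: passthru;
in
{
  a = hasPassthru { passthru = { }; }; # true
  b = hasPassthru { name = "hello"; }; # false
  c = getPassthru { name = "hello"; }; # { }
}
```

Evaluating this, for example with nix-instantiate --eval --strict, should show that the same symbol answers a yes/no question in one place and supplies a fallback value in the other.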

This proposal focuses on the way we format the former in Nixpkgs, for two reasons:

  1. It is unfortunate that ? has two distinct meanings. Visually differentiating them helps slightly with learning and understanding.
  2. Despite syntactically being a binary operator, ? does not behave like the other binary operators, which causes confusion.

The ? operator behaves a lot like the . operator (attribute lookup or “selection”), so it would make sense to format the use of these operators the same.

So by changing the formatting convention of ?, we expect a slight improvement in learning and understanding of the Nix language.

Detailed design

Change all documentation and style guides within the NixOS projects (Nix, Nixpkgs, etc.) to require that:

  • The “has attribute” operator is formatted without any whitespace before or after the ? character.
  • The default value declaration for bindings introduced in attribute set destructuring function declarations is formatted with a single space on both sides of the ? character.

Furthermore we encourage all Nix expression formatting tools to adopt these rules. A tool that supports vertical alignment may generalize “single space” to “at least one space”, but note that this is not an endorsement of vertical formatting.

Examples and Interactions

End result

As a result of this RFC, checking for an attribute should look as follows:

f = drv:
  if drv?passthru
  then drv.passthru
  else { };

You can see the similarity with the . operator, and you would be right to assume that passthru is a literal in both cases, as opposed to a variable reference.

An equivalent function declaration using a default still looks as follows:

f = { passthru ? { }, ... }: passthru;

Interlude about . or ternary operator

Although f is based on real Nixpkgs code and it is a nicely simple example, it is worth noting that f can be further simplified to

f = drv: drv.passthru or {}

However, this solution does not generalize to all problems that involve checking whether an attribute exists, so we will ignore it for the purpose of this RFC.
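As a rough check of the simplification, the sketch below compares the two formulations on two sample inputs. The names viaHasAttr, viaOr, withAttr and without are hypothetical, and agreement on two inputs is an illustration rather than a proof.

```nix
let
  # The explicit form from the proposal.
  viaHasAttr = drv: if drv ? passthru then drv.passthru else { };

  # The shorter form using selection with a fallback.
  viaOr = drv: drv.passthru or { };

  withAttr = { passthru = { tests = "t"; }; };
  without = { name = "hello"; };
in
assert viaHasAttr withAttr == viaOr withAttr;
assert viaHasAttr without == viaOr without;
"both forms agree on these inputs"
```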

Source of confusion

As discussed, a potential source of confusion is between the two meanings of ?. However, while they look the same, they can be disambiguated by their context, that is, by where they occur.

The same does not apply to the other source of confusion, as was encountered in nix#1059.

Given that almost all operators take expressions, except the “obvious” exception, ., we would expect the following generalization to work:

  • The operator + in a + b operates on the expressions a and b.
  • The operator // in a // b operates on the expressions a and b.
  • The operator ? in a ? b operates on the expressions a and b, except it does not. b is an attribute name and not an expression.

While removing the whitespace around ?, resulting in a?b, does not clearly convey this fact, it may at least draw attention to it.
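A minimal sketch of this pitfall, using made-up names s and b: the right-hand side of ? is read as an attribute name, so a variable in that position is not looked up unless it is interpolated.

```nix
let
  s = { x = 1; };
  b = "x";
in
{
  # Looks for an attribute literally named "b"; the variable b is not consulted.
  literal = s ? b; # false

  # Antiquotation computes the attribute name from the variable b.
  interpolated = s ? ${b}; # true

  # A quoted attribute name works as well.
  quoted = s ? "x"; # true
}
```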

Drawbacks

Contributors to Nixpkgs will have to adapt a bit, and they may have to be reminded of the new convention.

Alternatives

Change the language

The current proposal makes for a slight improvement in readability and learning, but a more complete effect could be achieved by changing the language.

The downside of this is that it is a more invasive change that takes much more effort to decide and implement, and involves a long wait before Nixpkgs can transition to it.

Specifically, RFC 137 Nix language versioning needs to be implemented first, and the whole community needs to upgrade to a new Nix version that implements the change. Furthermore, language changes have a significant cost.

To illustrate what could be considered in an RFC that changes the syntax, we might create a less ambiguous syntax such as

f = { passthru or {}, ... }: passthru;

and

f = drv: drv.?passthru;

However, for the reasons stated above, we won’t explore such proposals in this RFC.

Always format with spaces

The current de facto rule is to always use spaces around ?.

While ignoring the semantics makes for simple rules, it does not aid understanding.

This alternative has the benefit of requiring no action.
However, if the minor nature of the improvement in this RFC leads some to believe that it is not worthwhile, I would suggest that they focus on more important things while they let the change happen.

Prior art

As of writing, I am unaware of prior discussions of this ambiguity, except for nix#1059.
? is hard to search for, so if anyone remembers a prior discussion, a link would be highly appreciated.

  • Issue nix#1059: “? operator gives wrong results”
  • The ? operator was introduced before Nix 0.5 and has been documented with spaces since.

Unresolved questions

No unresolved questions as of yet.

Future work

  • Consider changing the language to support a more consistent and unambiguous syntax. An example of this was shown for illustrative purposes in the Alternatives section.

The part with the uninteresting metadata

This goes at the top of the RFC, but isn’t rendered in a readable manner by Discourse; hence the “code” formatting.

---
feature: quest_to_questionably_solve_the_question_about_question_mark
start-date: (fill me in with today's date, YYYY-MM-DD)
author: Robert Hensing
co-authors: (find a buddy later to help out with the RFC)
shepherd-team: (names, to be nominated and accepted by RFC steering committee)
shepherd-leader: (name to be appointed by RFC steering committee)
related-issues: https://github.com/NixOS/nix/issues/1059
---

This sounds reasonable, and introducing a formatting convention is a cheap way of raising awareness. You pointed out on multiple occasions that we can de facto deprecate certain constructs without actually changing the language, and I agree we should do that where possible.

I don’t like this, because it makes operator precedence look confusing. Take this expression for example:

let drv.passthru = 42; in toString drv?passthru

Without spaces around ?, it is misleading, because the expression is actually parsed like this:

let drv.passthru = 42; in (toString drv)?passthru

This is not an issue with the . operator, and I think that’s a fundamental difference between . and ?:

nix-repl> let drv.passthru = 42; in toString drv.passthru 
"42"
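The precedence difference can also be shown with an expression that evaluates without error. This is only a sketch; wrap and s are hypothetical names chosen for illustration.

```nix
let
  wrap = x: { inner = x; };
  s = { a = 1; };
in
{
  # Function application binds tighter than ?, so this is (wrap s) ? a,
  # and the wrapped set has no attribute named a.
  implicit = wrap s ? a; # false

  # Parentheses are needed to apply ? to s first.
  explicit = wrap (s ? a); # { inner = true; }

  # Selection binds tighter than application, so this is wrap (s.a).
  selection = wrap s.a; # { inner = 1; }
}
```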

Oh, very good point!

Obligatory parentheses would remedy that, but I’m not sure if the cost/benefit works out then.
Maybe it should be a new operator (like .? from the example) after all then, as that could fix the precedence problem.

This RFC idea is not the quick win I expected. I think an operator needs to be added instead.

I’m fine with that, no strong opinions

Note that other, more popular languages like JavaScript, PHP or Dart use a field/member access operator preceded by a question mark as syntactic sugar for if objectOrNull == null then null else objectOrNull.member, so using .? can lead to both syntactic and semantic confusion for people coming from those languages.

Yeah I didn’t want to go into that much detail yet because I thought it was irrelevant.

  • .? ambiguity with other languages.

  • ?. might still suffer from that a bit, although I like that ?.foo contains an intact .foo.

  • ?? would not suffer from that ambiguity, but perhaps reflect the reader’s emotional state :wink:

Any other ideas?

Yeah, I agree that this is too off-topic for the RFC itself, but the RFC should still suggest reasonable alternatives, and a pre-RFC should be a good venue to bring up the issues with them.

I do not really like either ?. or .? because I can never recall which is the correct one, whenever I use them in JS or PHP. But of these two, using .? does make more sense, IMO, since it is checked access, rather than check-then-access in those other languages.

?? does not suffer from the minor syntactic confusion, but still does from the semantic one – nullOrValue ?? fallbackValue is a syntactic sugar for something like if nullOrValue != null then nullOrValue else fallbackValue in JavaScript, PHP and C#.

One alternative that might work better could be the in operator from Python and JavaScript. But it might require the checked key to be quoted to reduce parsing ambiguity.

in

I do think it needs to be one or two symbols and not a keyword, to make it appear similar to ., as that was at least half of the motivation here.
That doesn’t leave many good options then.
Wild idea: a.-b, with rationale: ignore / remove (-) the value and what’s left is a boolean, did the attr exist?

I might have to let go of this RFC idea.

If you manage to find a solution that relies purely on formatting, I don’t think writing an RFC for that would be necessary. It could be integrated into RFC 101 instead.


But let’s be honest, the Nix syntax is a mess here and we won’t be able to dig ourselves out without major changes to the language. I thought about this problem a while ago too, and arrived at the proposal of copying Kotlin’s operators:

  • foo ?: bar for if foo != null then foo else bar
    • This would replace and supersede the or operator (it is more powerful though because it can be used outside of attribute accesses)
    • This would be used for function default arguments instead of ?
  • foo.?bar for if foo.bar != null then foo.bar else null

The existing ? then can remain for membership checks.

My goal is mostly to make the language easier to understand and learn, so I was really looking to replace existing syntax (or formatting) rather than add more operators. We should be really careful about operators because many readers don’t use the language all that often.

I’m dropping any “has attribute” changes, but I think adding or for attribute binding defaults could still be an improvement:

{ a or null }: a

It’d be equivalent to

{ a ? null }: a

We could then phase out usage of ? in parameters.

What do you think?

Okay, if you want to go with minimal changes to the language this would be a first step I’d be in favor of. Additionally, I’d like to see or be usable outside of attribute access (so being able to write foo or bar instead of foo.bar or baz).

The problem with or is that IIRC it isn’t really a keyword and one can get into trouble by defining functions called or. But that may be a separate discussion.

Yeah it’s like a half-keyword; not in a binding or attribute context, but I don’t think it causes too much trouble.

nix-repl> or = throw "or!"

nix-repl> {}.a or "A"
"A"

nix-repl> or
error: syntax error, unexpected OR_KW

       at «string»:1:1:

            1| or
             | ^

nix-repl> { or = "ok"; }
{ or = "ok"; }

It’s just not easy to use when it’s in scope, so you’ll want to treat it like a proper keyword in practice.

nix-repl> let or = "hi"; in { inherit or; }.or
"hi"

Most importantly for the syntax improvement, there won’t be an ambiguity in the grammar.

nix-repl> { a or b }: a
error: syntax error, unexpected OR_KW, expecting '.' or '='

       at «string»:1:5:

            1| { a or b }: a
             |     ^

Maybe we can deprecate ? in favor of a hasAttr :: AttrSet -> String -> Bool function in builtins?

The argument order is flipped, but that is indeed an alternative.
We shouldn’t make Nix warn about ? because we’ll want to keep evaluating existing expressions without a deluge of warnings.
As for soft deprecation without warning, but perhaps a lint check in Nixpkgs CI, I’m not eager to “remove” ? without a suitably concise alternative.
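For comparison, here is a small sketch of the existing builtin next to the operator, with a made-up drv value. Note the flipped argument order: the attribute name comes first and the set second.

```nix
let
  drv = { passthru = { }; };
in
{
  # Operator form: set on the left, attribute name on the right.
  operator = drv ? passthru; # true

  # Builtin form: attribute name as a string first, then the set.
  builtin = builtins.hasAttr "passthru" drv; # true

  missing = builtins.hasAttr "meta" drv; # false
}
```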

There’s also an efficiency argument to be made. The interpreter is driven by the AST, so an operator is a bit simpler to evaluate than a primop call. I’m not making that argument without measurement though :wink:

I was actually suggesting a new function, didn’t check if we had one already :sweat_smile:

https://nixos.org/manual/nix/stable/language/builtins.html#builtins-hasAttr

If we do want to deprecate this syntax, we can add an experimental feature that disables it, similar to no-url-literals, and just add this feature to the ofborg commands.
