I think you are actually experiencing two separate issues. I hope I can point those out below and illuminate what is going on behind the scenes.
First of all, I have to agree with you that Nix breaks referential transparency in a sense, even without involving attribute set internal merging. This has to do with a syntax rewrite based optimization Nix does, namely that a dynamic identifier that just holds a string literal is rewritten to a literal identifier. This happens at parse time:
> nix-instantiate --parse -E '{ x = 2; }'
{ x = 2; }
> nix-instantiate --parse -E '{ ${"x"} = 2; }'
{ x = 2; }
Consequently, this optimization ignores statically known names that are not attainable by a simple AST rewrite, since they’d require a scope lookup:
> nix-instantiate --parse -E 'let name = "x"; in { ${name} = 2; }'
(let name = "x"; in { "${name}" = 2; })
Since it happens at parse time, it of course doesn’t compute values either:
> nix-instantiate --parse -E '{ ${"x" + ""} = 2; }'
{ "${("x" + "")}" = 2; }
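To pin down the rule that the three outputs above suggest, here is a small Python model. It is only a sketch: StrLit, Var, Concat and attr_name_kind are names I made up for illustration and are not taken from the Nix parser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrLit:
    """A plain string literal such as "x"."""
    value: str


@dataclass(frozen=True)
class Var:
    """A variable reference such as name (needs a scope lookup)."""
    name: str


@dataclass(frozen=True)
class Concat:
    """An addition such as "x" + "" (needs evaluation)."""
    left: object
    right: object


def attr_name_kind(interpolated):
    """Classify the expression found inside ${...} in attribute name position."""
    if isinstance(interpolated, StrLit):
        # Purely syntactic check: a bare string literal is rewritten to a static name.
        return ("static", interpolated.value)
    # Anything else would need a scope lookup or evaluation, so it stays dynamic.
    return ("dynamic", interpolated)


print(attr_name_kind(StrLit("x")))
print(attr_name_kind(Var("name")))
print(attr_name_kind(Concat(StrLit("x"), StrLit(""))))
```

The point of the model is that the check looks only at the shape of the syntax tree, never at the value the expression would have.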
This rewrite would be okay if dynamic and non-dynamic attribute names behaved the same at run time. They do not, since dynamic attributes are subject to different merging behavior (I’ll discuss this in more detail below).
We can illustrate the problem by taking the following expression and then replacing the string literal in its dynamic attribute name with a computed expression of the same value:
{
  foo.x = 1;
  ${"foo"} = { y = 2; };
}
# Evaluates to
# => { foo = { x = 1; y = 2; }; }
(Note that the merged attribute set is already available at parse time, i.e. nix-instantiate --eval --strict and nix-instantiate --parse give the same output.)
Now consider:
{
  foo.x = 1;
  ${"f" + "oo"} = { y = 2; };
}
# Crashes with
# => error: dynamic attribute 'foo' at (string):3:3 already defined at (string):2:3
Since the optimization doesn’t kick in, the attribute name is not static here and behaves differently.
The fix for this is quite simple: always treat ${…} (dynamic attribute names) as dynamic, just as is done for "${…}" (string interpolation), where this optimization doesn’t exist:
> nix-instantiate --parse -E '{ "${"x"}" = 2; }'
{ "${("x")}" = 2; }
As it turns out, there is already an open issue for this, although (to my knowledge) no one had noticed the loss of referential transparency before.
Now for internal attribute set merging, which is a bit peculiar. The need for it arises from the attribute path syntax, which is arguably necessary to make writing NixOS configurations bearable:
{
  services.openssh.enable = true;
  services.pipewire = {
    enable = true;
    pulse.enable = true;
  };
}
==
{
  services = {
    openssh = { enable = true; };
    pipewire = {
      enable = true;
      pulse = { enable = true; };
    };
  };
}
This merge happens at parse time (which was surprising to me). The main merging logic is implemented in the addAttr function in the parser, which constructs an attribute set expression representation for the evaluator. I’ll try to summarize the merging logic below, but it will be a simplification: attribute sets are quite a complicated topic, since their implementation is intertwined with rec { … } attribute sets (which also have an accompanying scope) and let … in … bindings.
One key thing about the merging logic is that only statically inferrable merges are done, i.e. attribute path syntax is just syntactic sugar. This simplifies the implementation significantly—attribute sets are complicated enough as it stands— and allows it to be implemented more efficiently. (I’m sure there are also cases where dynamic merging would be confusing, although I don’t know any off the top of my head.) It does violate the symmetry between dynamic and static attribute names, though, as we’ll see.
When parsing an attribute set expression to create an ExprAttrs structure, we can treat it as a list of bindings, each of which has one of the following forms:
- inherit attr or inherit (from) attr: These are inserted if attr isn’t already bound in the ExprAttrs structure we are building. If it is already bound, parsing fails. This is simplified tremendously by the fact that dynamic attribute names are disallowed in inherit. (Merging { inherit (foo) bar; bar = { x = 1; }; } isn’t possible since we can’t tell statically if foo is an attribute set and what keys it contains.)
- attr = value or attr.path = value (with any number of attribute path parts): These are subject to the merging logic.
ExprAttrs contains two separate lookup tables for bindings:
- attrs, mapping statically known single attribute names (not paths) to an expression representing their eventual value.
- dynamicAttrs, a list of expressions representing attribute names, each coupled with an expression representing its eventual value.
That it always maps from a single attribute name (or expression) to a value expression means, for starters, that the parser needs to synthesize ExprAttrs structs for attribute paths: e.g. { foo.bar.baz = 1; } can only be represented by three nested ExprAttrs structs that correspond to { foo = { bar = { baz = 1; }; }; } (this is why nix-instantiate --parse prints the latter form rather than the original expression). This is also the reason why explicitly written attribute sets get merged (like in your original example): the parser would not be able to distinguish between explicitly created ExprAttrs and ones synthesized from attribute paths.
There is no special code for this synthesis; in fact, it is just addAttr at work, which operates as follows.
It looks at the head of the attribute path:
- If it is a dynamic entry, it is pushed onto dynamicAttrs. If necessary, a new attribute set is created and populated with the remaining attribute path. Finally, the attribute value expression is inserted where appropriate. No merging is done across these, and merging of static attribute paths can go only up to the first dynamic attribute name (if any). (Note that this description doesn’t correspond to the implementation; since the implementation is actually buggy, I thought I’d only describe it conceptually.)
> nix-instantiate --parse -E '{ ${"fo"+"o"}.bar = 3; ${"fo"+"o"}.baz = 3; ${"fo"+ "o"} = 3; }'
{ "${("fo" + "o")}" = { bar = 3; }; "${("fo" + "o")}" = { baz = 3; }; "${("fo" + "o")}" = 3; }
- For static entries, it looks at the current ExprAttrs. If no matching entry exists, it is inserted into attrs and merging of the rest of the attribute path continues as normal, creating empty ExprAttrs as necessary (this is how nested ExprAttrs are synthesized in the example above). If a matching entry already exists, there are two possibilities:
  - If it is not an ExprAttrs, parsing fails due to duplicate attribute definitions.
  - If it is an ExprAttrs, the attribute path, with the entry we just looked at split off, is merged into that ExprAttrs.
The algorithm is notably expressed without recursion, but can probably best be understood in terms of recursion.
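Here is how I would sketch the conceptual rules above recursively in Python. This is a toy model and not the real C++ code: Dyn, ExprAttrs (reduced to the two tables), ParseError, nest and add_attr are all stand-ins I made up, and the sketch implements the conceptual behavior, so it carries dynamic attributes over when two set literals are merged.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dyn:
    """A dynamic attribute name such as ${"fo" + "o"}, kept as unevaluated source."""
    source: str


@dataclass
class ExprAttrs:
    """Toy stand-in for the parser's ExprAttrs with the two tables described above."""
    attrs: dict = field(default_factory=dict)          # static name -> value expression
    dynamic_attrs: list = field(default_factory=list)  # (Dyn name, value expression) pairs


class ParseError(Exception):
    pass


def nest(path, value):
    """Wrap value in freshly synthesized sets, one per remaining path part."""
    if not path:
        return value
    synthesized = ExprAttrs()
    add_attr(synthesized, path, value)
    return synthesized


def add_attr(target, path, value):
    """Add the binding `path = value` to target, merging only along static names."""
    head, rest = path[0], path[1:]
    if isinstance(head, Dyn):
        # Dynamic head: recorded as-is and never merged with anything.
        target.dynamic_attrs.append((head, nest(rest, value)))
    elif head not in target.attrs:
        target.attrs[head] = nest(rest, value)
    elif not isinstance(target.attrs[head], ExprAttrs):
        raise ParseError(f"attribute '{head}' already defined")
    elif rest:
        # Split off the head and merge the remaining path into the existing set.
        add_attr(target.attrs[head], rest, value)
    elif isinstance(value, ExprAttrs):
        # Two set-valued bindings for one name: merge them binding by binding.
        existing = target.attrs[head]
        for name, inner in value.attrs.items():
            add_attr(existing, [name], inner)
        existing.dynamic_attrs.extend(value.dynamic_attrs)
    else:
        raise ParseError(f"attribute '{head}' already defined")


# Example: a static path, a set literal, and two paths containing dynamic names.
root = ExprAttrs()
add_attr(root, ["foo", "bar"], 1)
add_attr(root, ["foo", "baz"], ExprAttrs(attrs={"jdf": 2}))
add_attr(root, [Dyn('"ab" + "cd"'), "rr"], 3)
add_attr(root, ["foo", Dyn('"ba" + "z"'), "ghf"], 4)
print(root)
```

Note that the dynamic name under foo ends up in the dynamic_attrs of the nested foo set, not in the top-level one, because the static prefix of the path can still be resolved at parse time.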
This means that, after parsing, ExprAttrs contains an expression representation of the eventual attribute set structure as far as it is statically inferable (across the respective attrs fields), with all attribute paths eliminated. Dynamic attributes are inserted into the dynamicAttrs of the innermost ExprAttrs that can be determined statically.
{
  foo.bar = 1;
  foo.baz = { jdf = 2; };
  ${"ab" + "cd"}.rr = 3;
  foo.${"ba" + "z"}.ghf = 4;
}
==
# after parsing
{
  foo = { bar = 1; baz = { jdf = 2; }; "${("ba" + "z")}" = { ghf = 4; }; };
  "${("ab" + "cd")}" = { rr = 3; };
}
How do the dynamic attributes end up in the attribute set value, though? This happens at evaluation time (necessarily) and proceeds in the following steps:
- First the attribute set is constructed according to attrs (not recursively though, since Nix is lazy!).
- Scoping-related things happen for rec { … } sets (and the obscure __overrides feature is handled). Dynamic attributes are ignored in the rec { … } scope!
- Dynamic attributes are added to the attribute set. Notably, they are not merged into the attribute set. If an attribute with the same name as the dynamic one already exists in the attribute set, evaluation fails regardless of the attribute being another attribute set or not. We have already seen this in our earlier investigation of referential transparency.
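The first and last of these steps can be sketched in a few lines of Python. Again this is a toy under stated assumptions: eval_attrs and EvalError are hypothetical names, laziness is not modeled, and the rec { … } and __overrides step is left out entirely.

```python
class EvalError(Exception):
    pass


def eval_attrs(static, dynamic):
    """Build an attribute set from static bindings plus dynamic (name thunk, value) pairs."""
    # Step 1: start from the statically known attributes.
    result = dict(static)
    # Step 2 (rec scope, __overrides) is deliberately omitted from this sketch.
    # Step 3: add dynamic attributes; an existing name is an error, never a merge.
    for name_thunk, value in dynamic:
        name = name_thunk()  # the name is only known once it has been evaluated
        if name in result:
            raise EvalError(f"dynamic attribute '{name}' already defined")
        result[name] = value
    return result


# A dynamic name that does not collide is simply added.
print(eval_attrs({"y": 3}, [(lambda: "z" + "", 2)]))

# A dynamic name that collides fails, even though both values are attribute sets.
try:
    eval_attrs({"foo": {"x": 1}}, [(lambda: "f" + "oo", {"y": 2})])
except EvalError as err:
    print("error:", err)
```

The second call mirrors the failing ${"f" + "oo"} example from the beginning: by the time the name is known, the only options left are to insert it or to fail.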
Now for the merging bug you’ve experienced: All of these should evaluate to the attribute set { x = { y = 3; z = 2; }; }, but the third element in the list does not:
[
  { x.y = 3; x.${"z" + ""} = 2; }
  { x.${"z" + ""} = 2; x.y = 3; }
  { x = { y = 3; }; x = { ${"z" + ""} = 2; }; }
  { x = { ${"z" + ""} = 2; }; x = { y = 3; }; }
]
Thanks to nix-instantiate --parse we can further determine that this seems to be some sort of parser bug, perhaps in addAttr:
# > nix-instantiate --parse tmp.nix
[ ({ x = { y = 3; "${("z" + "")}" = 2; }; }) ({ x = { y = 3; "${("z" + "")}" = 2; }; }) ({ x = { y = 3; }; }) ({ x = { y = 3; "${("z" + "")}" = 2; }; }) ]
Notice that the third element becomes ({ x = { y = 3; }; }) after parsing, i.e. its dynamic attribute has been dropped.