In the discussions for RFC 193, @matklad and I independently had the idea of formalizing a “data only” subset of Nix, similar to ZON (Zig Object Notation) files in the Zig ecosystem (see their comment and mine).
This format would:
- be 100% valid Nix code, but
- guarantee linear evaluation complexity by disallowing the definition and use of functions and most other language features.
This could be very useful for defining the “data only” portion of a Flake (which is exactly what RFC 193 is about), but also more generally to replace JSON and TOML in the Nix ecosystem: in flake.lock, in the lockfiles of npins or niv, or in profile manifests, which are already stored as an informally specified .nix file when using nix-env, but as .json when using nix profile. It could even replace ATerm as the serialization format for derivations!
This file format needs a new file extension (I liked the suggestion .nox, for Nix Object eXpression), and the evaluator then needs to raise errors if any disallowed language constructs are used within these files. They could still be imported without a problem from regular .nix files, of course.
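To make the restriction concrete, here is a sketch of a file that is valid Nix today but that I would expect the evaluator to reject if it had the new extension. Which constructs are actually disallowed is still an open question, so the comments are assumptions, and the attribute names are made up for illustration:

```nix
{
  # Plain data: fine in a regular .nix file and in the proposed format.
  name = "example";
  version = 3;

  # Function definition: valid Nix, but presumably an error in a data-only file.
  double = x: x * 2;

  # Function application: presumably rejected as well.
  four = (x: x * 2) 2;
}
```

Removing the last two attributes leaves a file that only contains data and should evaluate identically in both worlds.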
New builtin functions toNOX and fromNOX would also be provided.
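For comparison, this is how the existing builtins.toJSON and builtins.fromJSON behave today. toNOX and fromNOX do not exist, but I would assume they have the same shape: one turns a value into a string, the other parses it back. A minimal sketch, where `data` and `json` are just illustrative names:

```nix
let
  # A data-only value, the same kind of thing a .nox file would hold.
  data = { a = "a"; b = { c = null; d = 3; }; };

  # Existing builtin: serializes the value to a JSON string.
  json = builtins.toJSON data;
in
  # Existing builtin: parses the string back. The round trip preserves
  # the value, so this expression evaluates to true.
  builtins.fromJSON json == data
```

A hypothetical fromNOX (toNOX data) == data would be the equivalent round-trip property for the new format.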
Some questions that remain:
- What should the name of the file format be? (.nox, .non, .nxn, .nixon)
- What syntax should actually be allowed in this format?
The second question seems much harder to answer. A minimal subset is easy to think about:
{
a = "a";
# Comments are important
b = { c = null; d = 3; };
e.f.g = true; # Are we sure this is a good idea though?
}
Attrsets, strings, ints, bools and null must all be supported. Comments most likely as well.
Nested attribute names (e.f.g in the example above, maybe there’s a better technical term) already aren’t quite as clear. They are useful to reduce indentation and make some things easier to write, but they also make serialization ambiguous and make it harder to write tooling that can work with this format. However, this hasn’t kept TOML from exploding in popularity, so maybe it’s not much of an issue.
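The ambiguity is easy to see in today's Nix: both spellings below evaluate to the same value, so a serializer has to pick one of them, and a tool that edits such a file has to understand both. The names `nested` and `explicit` are only there for illustration:

```nix
let
  # Dotted form: short to write, one line per leaf.
  nested = { e.f.g = true; };

  # Fully spelled-out form: more indentation, but only one way to write it.
  explicit = { e = { f = { g = true; }; }; };
in
  # Attribute sets compare structurally, so this evaluates to true.
  nested == explicit
```

Evaluating this (for example with nix-instantiate --eval) gives true, which means the choice between the two forms carries no information and is purely a formatting decision that the format would have to pin down or leave to tooling.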
But then it becomes less clear; what about floats? let ... in? import of other files of the same format? Every new feature makes this format more useful, but of course, it also makes it much harder to properly serialize, deserialize and modify programmatically.
So, what are your thoughts on this? I am feeling kinda motivated to write an RFC, just for having the format itself. Whether we put it in all the different places I mentioned above is a completely separate story.