The Uncompromising Nix Code Formatter

18 Likes

Hmm, is there a comparison between this and nixfmt? Is there a particular reason this one was designed, rather than using nixfmt - just the processing speed?

I designed Alejandra because I wanted a formatter to tell me how every piece of my code should look like, no exceptions

Speed, getting semantics perfectly correct, etc,
that is just what a good formatter should have,
this is not negotiable

This community is getting big
and we still do not have something like that.
Other communities like Rust, Go, Terraform, Python, they do have one.

There is clearly a problem to solve

Now, here are some technical facts though,
(I don’t care about beauty/style arguments)
to answer your question:

  1. Nixpkgs-fmt has bugs: Issues · nix-community/nixpkgs-fmt · GitHub

  2. Nixfmt has bugs: Escape Characters · Issue #81 · serokell/nixfmt · GitHub

  3. Alejandra was able to address certain bugs which I found in other formatters

See why:

  1. Format nixpkgs with nixfmt, run nixpkgs-review,
    and you’ll discover that many packages on Nixpkgs no longer build

    error: undefined variable 'PYTHON_EXECUTABLE'
    
        at /home/kamadorueda/.cache/nixpkgs-review/rev-f557203a0468aad924fcfd6ef3216d6497125df5-dirty/nixpkgs/pkgs/development/compilers/nextpnr/default.nix:57:22:
    
            56|     substituteInPlace ./ice40/CMakeLists.txt \
            57|       --replace '''${PYTHON_EXECUTABLE}' '${icestorm.pythonInterp}'
              |                      ^
            58|   '';
    
  2. Format Nixpkgs with nixpkgs-fmt, run nixpkgs-review,
    and you’ll discover that 37210 (almost all) packages on Nixpkgs needs
    rebuilding

Do you want a formatter breaking your builds?
I’m sure you don’t

Bugs multiplied by the 2.3 millions lines of Nix code that Nixpkgs have,
adds up a lot

Now, format Nixpkgs with Alejandra, you’ll get only 89 rebuilds!!!
all of them justified. No bugs.

Other facts:

  1. Nixpkgs-fmt is not extensive:

    It only formats a few elements on the grammar
    and leave others unformatted.

    Not formatting everything is a limitation of an architecture based on rules
    and not AST rebuilding:
    https://github.com/nix-community/nixpkgs-fmt/blob/5ae8532b82eb040ca6b21ae2d02d9e88f604e76a/src/rules.rs#L23

    Some people see “not formatting” as a feature of a formatter,
    I just simply can’t, that’s not the definition of a formatter.

    A formatter formats, right?

  2. Last release of nixfmt: Feb 10, 2020 (2 years ago)

  3. Where are the tests of nixfmt? Add a test suite · Issue #13 · serokell/nixfmt · GitHub

  4. It’s been more than ten minutes on a 16 cores machine,
    and nixfmt is still formatting Nixpkgs

In short, the important differences:

  • Correctness (bugs, broken builds)
  • Totality (a formatter must format everything)
  • Speed (dev time is expensive)

Style is subjective and I’ve been working lately on getting Alejandra into a format that is similar to Nixpkgs, in other words “beautiful” to the 51%: Issues · kamadorueda/alejandra · GitHub

13 Likes

Ah, awesome! I hadn’t realized it wasn’t doing a proper AST rebuild. That’s a bit sad; it seems nixpkgs-format is gearing up to become the default formatter for nixpkgs. Nice to see an alternative approach here that results in more thorough formatting, though I guess it may step on too many toes for upstream use.

Though I will say that

Alejandra has no bugs:

is a very bold statement. I’m sure we can tickle out a few :wink:

This is a nice overview, thank you very much for writing this, as well as the formatter.

4 Likes

First let me say that I appreciate your effort to create a formatter that tries to not change semantics of the code. Perhaps you can make it work to not require a single rebuild? ~90 rebuilds is still a lot for a formatter that claims to not touch the semantics.

I applied your formatter to my system config and have to say, there are some flaws:

  1. It removes a lot of empty lines and packs stuff together. I can see how it makes no difference for the machine, but I want my grouping to be preserved. I you do not want to preserve the groups, you can sort attributes by alphabet as well, it won’t change semantics anyway…

  2. Argsets are broken into multiple lines. Thats annyoing if there are only 2 or 3 items, which easily fit a single line. And putting the full capture/at-pattern on just another new line makes it just a bit uglier… (Though I like that you use leading comma!)

  3. In my opinion there is no need to put parens and contents on separate lines (the list is actually much longer but I changed the diff for brevity):

    -      exclude = (map (e: "%h/${e}") [ ".cache" ]);
    +      exclude =
    +        (
    +          map (e: "%h/${e}") [
    +            ".cache"
    +          ]
    +        );
    
  4. let … in [ … ] seems to indent the list when it gets placed on a new line, while let … in { … } places the sets opening and closing braces at the same indentation as the in. This feels inconsistent. What I said about lists, seems to be true for arbitrary expressions but set literals.

  5. foo = { bar = ""; }; gets pulled into a single line, rather than leaving it on 3. This introduces diff noise which I originally wanted to avoid by writing it the way I did when adding more attributes.

  6. Why is the following split into 2 lines, while a lot of other constructs are getting pulled into a single line?

    -        packageRequires = ep: with ep.melpaStablePackages; [ agda2-mode eri annotation ];
    +        packageRequires =
    +          ep: with ep.melpaStablePackages; [ agda2-mode eri annotation ];
    
  7. Within a multiline string, the following happens:

           extraInit = ''
             (setenv "PATH"
                     (concat "${pkgs.leiningen}/bin:" (getenv "PATH")))
    -        (setq lsp-clojure-server-command '("${pkgs.bash}/bin/bash" "-c" "${pkgs.clojure-lsp}/bin/clojure-lsp"))
    +        (setq lsp-clojure-server-command '("${pkgs.bash}/bin/bash" "-c" "${
    +        pkgs.clojure-lsp
    +      }/bin/clojure-lsp"))
           '';
    

    I could probably live with it, if the 2 “new” lines would be indented another 2 spaces each, though would still consider no additional line breaks at all in multiline strings, as even though both are equivalent for nix, it makes it hard for a human to recognize that these 3 lines are actually one when “rendered”.

  8. Here the overall level of indention just gets to big in my opinion, and I am also back and forth about relative position of function arg and body…

        programs.emacs.extraPackages =
          ep:
          [
            (
              ep.erlang.overrideAttrs (
                oa: { buildInputs = oa.buildInputs ++ [ pkgs.perl pkgs.ncurses ]; }
              )
            )
          ];
    

From the diff there is not much I’d agree with the formatter, and the removed empty lines, the forced multi-line argsets (especially if not at the beginning of a file, but eg. flakes output function) and the multiline string behaviour are stoppers for me. I could get used to everything else I think.

10 Likes

First let me say that I appreciate your effort to create a formatter that tries to not change semantics of the code. Perhaps you can make it work to not require a single rebuild? ~90 rebuilds is still a lot for a formatter that claims to not touch the semantics.

A good amount of the rebuilds are caused by things like this: chrootenv: filter default.nix from src by tomberek · Pull Request #157760 · NixOS/nixpkgs · GitHub (reviews appreciated!) where a derivation is altered because the nix code is included in the source. I’d consider that to be an upstream bug, not the formatter changing semantics.

For specific style issues; those can be fixed and iterated on by modifying the rules, please file issues for them and we can iterate our way toward rules that are amenable as often as possible.

8 Likes

I’ve seen that some of the things I mentioned here do already have corresponding issues open, though I was not yes able to create issues for the things mentioned here that don’t have an issue, as I’m still in the office and want to create the issues later today when I have a bit more time.

2 Likes

Here we go :slight_smile: style discussion. With appreciation of your taste, mine differs, I really like the way alejandra works. No style is right or wrong though, as long as the semantics are not touched.

I’d rather see bad than incorrect or none/mixed formatting.

Do you have any idea on how stable the resulting code is from a git perspective, e.g. how many additional lines change when one adds elements to lists or sets, compared to nixpkgs-format?

I agree, taste isn’t negotiable :smiley:

And I agree that some of my points are indeed just that… My personal sense of aesthetics, and I could probably live with most of them.

But the removed newlines are indeed already filed as a bugissue, some function head issues are also. So it is not just me.

And tjey even mention in the readme, style is not yet set in stone, so lets discuss it, as long as its still possible.

A comment in the RFC talks about that: [RFC 0101] Nix formatting by happysalada · Pull Request #101 · NixOS/rfcs · GitHub

At the moment – and as far as I can tell – alejandra could be called a semantic code pretty printer, or AST renderer, or “uncompromising”—you get the point. While I don’t dislike that approach, I see that there are valid use cases where the format the code was previously written with shall be taken into account. (If only to allow skipping some sections from the formatting.)

Without going into details about examples about why this is needed—how does alejandra deal with this at the moment?

2 Likes

We have access to the original AST, and so the information is there, so some things like:

Are perfectly possible to implement

2 Likes

Are cases like @NobbZ’ example of letting foo = { bar = ""; }; and

foo = {
  bar = "";
};

stay the way they were because either way is fine possible too?

2 Likes

Just wanted to let you know that Alejandra now keeps at most 1 newline on attr-sets, so the human can decide how to group items:

Peek 2022-01-20 09-27

10 Likes

In the latest version of Alejandra, adding a newline to { b = 1; \n }; expands it to multiline, removing it collapses it, which is exactly what you want

4 Likes

Lists and let-ins allow at most 1 newline as well :rocket:

Peek 2022-01-20 09-27

8 Likes

We are now down to 47 rebuilds:
https://buildkite.com/kamadorueda/alejandra/builds/33#2f992509-7df6-4fa6-b3f3-c6b95f63536a/14037-14411

5 Likes

After today’s changes, the CLI argument --max-width and therefore its default value of 80 columns was removed

With this, users can enter a new line where they decide the code should be formatted as vertical, and by removing new-lines users can also create lines (almost) as long as they wish

Everything keeps being uncompromising in that every element has a predefined style (either vertical or horizontal), and at the same time we let the human decide one or two things about readability. From our experiments, this make nixpkgs diffs look nicer, and working aided by the tool more pleasant!

7 Likes

I have released an editor integration with Visual Studio Code:

On Nixpkgs as well: alejandra: 2022-02-10 -> 2022-02-12, add vscode extension by kamadorueda · Pull Request #159705 · NixOS/nixpkgs · GitHub

I’ll be happy if some courageous beta testers want to give it a try!

3 Likes

Drop your code or fetch a random file, and make it beautiful automagically!

Try it on your browser: Alejandra đź’…

7 Likes