Pre-RFC: pipe operator `|>`

Summary

Introduce a new “pipe” operator, |>, to the Nix language, defined as f a = a |> f.
Additionally, elevate lib.pipe to a built-in function.

As a reminder, pipe a [ f g h ] is defined as h (g (f a)).
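As a rough sketch of that definition, pipe can be written as a left fold that applies each function in turn to the running value. The let-bound pipe below is meant to mirror lib.pipe; the sample functions and the attribute name example are purely illustrative.

```nix
let
  # Apply each function in the list to the accumulated value, left to right.
  pipe = value: functions: builtins.foldl' (acc: f: f acc) value functions;
in
{
  # Equivalent to: toString ((x: x * 2) ((x: x + 1) 1))
  example = pipe 1 [
    (x: x + 1)
    (x: x * 2)
    toString
  ];
}
```

Here example evaluates to "4": the value 1 is incremented, doubled, and finally converted to a string.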

Motivation

Advanced data processing, such as transforming a list, is common in Nixpkgs. Yet the language has no support for function concatenation/composition, which makes such constructs unwieldy and difficult to format well. lib.pipe may be the most powerful library function in that regard, but it is unknown to or overlooked by many because it is not easily discoverable: despite its great usefulness, it is currently used in fewer than 30 files in Nixpkgs (rg '[\. ]pipe .* \['). Additionally, it is not accessible to Nix code outside of Nixpkgs, and due to Nix’s lazy evaluation, debugging type errors is really difficult.

Let’s have a look at an arbitrarily chosen snippet of Nixpkgs code:

defaultPrefsFile = pkgs.writeText "nixos-default-prefs.js" (lib.concatStringsSep "\n" (lib.mapAttrsToList (key: value: ''
  // ${value.reason}
  pref("${key}", ${builtins.toJSON value.value});
'') defaultPrefs));

It is arguably pretty hard to read and reason about. Even when applying some more whitespace-generous formatting:

defaultPrefsFile = pkgs.writeText "nixos-default-prefs.js" (
  lib.concatStringsSep "\n" (
    lib.mapAttrsToList
    (
      key: value: ''
        // ${value.reason}
        pref("${key}", ${builtins.toJSON value.value});
      ''
    )
    defaultPrefs
  )
);

One can observe the following issues:

  • If you want to follow the data flow, you must read it from bottom to top, from the inside to the outside (the input here is defaultPrefs).
  • Adding a function call to the output would require wrapping the entire expression in parentheses and increasing its indentation.

Compare this to the equivalent call with lib.pipe:

defaultPrefsFile = lib.pipe defaultPrefs [
  (lib.mapAttrsToList (
    key: value: ''
      // ${value.reason}
      pref("${key}", ${builtins.toJSON value.value});
    ''
  ))
  (lib.concatStringsSep "\n")
  (pkgs.writeText "nixos-default-prefs.js")
];

The code now clearly reads from top to bottom in the order the data is processed, and it is easy to add and remove processing steps at any point.

With a dedicated pipe operator, it would look like this:

defaultPrefsFile = defaultPrefs
  |> lib.mapAttrsToList (
    key: value: ''
      // ${value.reason}
      pref("${key}", ${builtins.toJSON value.value});
    ''
  )
  |> lib.concatStringsSep "\n"
  |> pkgs.writeText "nixos-default-prefs.js";

The artificial distinction between the first input and the functions in the list is now gone, and so are the parentheses around the functions. With the lower syntax overhead, using the operator becomes attractive in more situations, whereas lib.pipe pays for its overhead only in more complex scenarios (usually three functions or more). Having a dedicated operator also increases visibility and discoverability of the feature.

Detailed design

|> operator

A new operator |> is introduced into the Nix language. Semantically, it is defined as the reverse of function application: f a = a |> f. It is left-associative and has a binding strength one weaker than function application: a |> f |> g b |> h = h ((g b) (f a)).
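To make the associativity and precedence rule concrete, here is a small sketch of how such a pipeline would desugar. It is written without |> so that it evaluates on a Nix that lacks the operator; the names a, f, g and h and their bodies are made up for illustration.

```nix
let
  a = 2;
  f = x: x + 1;
  # g takes one extra argument before the piped value.
  g = factor: x: x * factor;
  h = x: "result: ${toString x}";
in
{
  # a |> f |> g 10 |> h
  # groups as ((a |> f) |> (g 10)) |> h, because application binds
  # tighter than |> and the operator is left-associative.
  desugared = h ((g 10) (f a));
}
```

The attribute desugared evaluates to "result: 30": f turns 2 into 3, g 10 multiplies by 10, and h formats the value.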

builtins.pipe

lib.pipe’s functionality is implemented as a built-in function. The main motivation for this is that it allows for better error messages, such as line numbers, when some part of the pipeline fails. Additionally, it allows easy usage outside of Nixpkgs and increases discoverability.

While Nixpkgs is bound to minimum Nix versions and thus |> won’t be available there until several years after its initial implementation, it can directly benefit from builtins.pipe and its better error diagnostics by overriding lib.pipe. Elevating a Nixpkgs library function to a builtin has been done several times before, for example with bitAnd, splitVersion and concatStringsSep.
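One way such an override could look is the usual attribute fallback with or. This is only a sketch: builtins.pipe is the builtin proposed here and is assumed not to exist in current Nix, so the fold-based fallback is what actually runs today.

```nix
let
  # Use the (proposed, hypothetical) builtin when the running Nix provides it,
  # otherwise fall back to a pure-Nix definition.
  pipe = builtins.pipe or (value: functions: builtins.foldl' (acc: f: f acc) value functions);
in
{
  example = pipe [ 3 1 2 ] [
    (builtins.sort builtins.lessThan)
    (map toString)
    (builtins.concatStringsSep ",")
  ];
}
```

With the fallback, example evaluates to "1,2,3". Callers would not need to change once the builtin becomes available.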

Examples and Interactions

Tooling support

Like any language extension, this will require the available Nix tooling to be updated. Updating parsers should be pretty easy, as the syntax changes to the language are fairly minimal. Tooling that evaluates Nix code in some way or does static code analysis should be easy to support too, since one may treat the operator as syntactic sugar for function application. No fundamentally new semantics are introduced to the language.

Prior art

Nickel has |> too, with the same name and semantics.

F# has |>, called the “pipe-forward” operator, with the same semantics. Additionally, it has “pipe-backward” <|, as well as >> and << for forward and backward function composition. <| is equivalent to function application, but its lower precedence allows removing parentheses:
g (f a) = g <| f a

Elm has the same operators as F#.

Haskell has the (backwards) function composition operator (.), defined as (g . f) a = g (f a).

|> is definable as an infix function in several other programming languages, and in even more languages as a macro or higher-order function (including Nix, that’s lib.pipe).

Alternatives

For each change this RFC proposes, there is always the trivial alternative of not doing it. See Drawbacks.

More operators

We could use the occasion and introduce more operators like those in F#.

Function composition is mostly interesting for the so-called “point-free” programming style, where partially applied compositions of functions are preferred over the introduction of lambda terms. However, Nix is not well suited for that programming style for various reasons, nor would that point-free style have many applications in real-world code: Nix is mostly used to apply functions, not to define them.

The reverse-pipe operator has a lot less use, and adding it would also raise a lot of questions about its interaction with forward-pipe, like associativity and operator precedence. Those are certainly resolvable, but the fact remains that |> and <| interact in unintuitive ways. Programming languages that do have it recommend sticking to one direction without mixing them for that reason. Therefore, it is likely not worth the complexity cost.

Change the pipe function signature

There are many equivalent ways to declare this function, instead of just using the current design. For example, one could flip its arguments to allow a partially-applied point-free style (see above). One could also make this a single-argument function so that it only takes the list as argument.
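As an illustration of the flipped variant, the sketch below defines a hypothetical pipeFlipped that takes the function list first, so that partially applying it yields a reusable function. The names pipeFlipped and slug and the sample strings are assumptions made for this example only.

```nix
let
  # Current shape: value first, then the list of functions.
  pipe = value: functions: builtins.foldl' (acc: f: f acc) value functions;
  # Flipped shape: list first, value last.
  pipeFlipped = functions: value: pipe value functions;
  # Partial application gives a point-free style definition.
  slug = pipeFlipped [
    (builtins.replaceStrings [ " " ] [ "-" ])
    (s: "pkg-${s}")
  ];
in
{
  examples = map slug [ "foo bar" "baz" ];
}
```

Here examples evaluates to [ "pkg-foo-bar" "pkg-baz" ]. The trade-off is that the common case of piping a single value reads less naturally with the list in front.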

Drawbacks

  • Introducing |> has the drawback of adding complexity to the language, and it will break older tooling.
  • The main purpose of builtins.pipe is as a stop-gap until Nixpkgs can use |>. After that, it will be mostly redundant.

Unresolved questions

  • Who is going to implement this in Nix?
  • How difficult will the implementation be?
  • Will this affect evaluation performance in some way?
    • There is reason to expect that replacing lib.pipe with a builtin will reduce its overhead, and that the builtin should have little to no overhead compared to regular function application.

Future work

Once introduced and usable in Nixpkgs, existing code may benefit from being migrated to using these features. Automatically transforming nested function calls into pipelines is unlikely, as doing so is not guaranteed to always be a subjective improvement to the code. It might be possible to write a lint which detects opportunities for piping, for example in nixpkgs-hammering. On the other hand, the migration from pipe to |> should be a straightforward transformation on the syntax tree.

Definitely on board. That’s one thing I have missed from other functional languages for sure.

Gone live (with minor modifications): https://github.com/NixOS/rfcs/pull/148

This would be great to have. I had written a small function that achieves something similar to avoid ((parenthesis(hell())), but changing the direction did not occur to me.

Oh wow, this is devious. I actually thought about hacking something like this too, but did not know how to make variadic functions in Nix. Your solution has the obvious downside that it will horribly break when higher order functions are involved …

My idea involved having a function pipe and a special function pipeEnd; passing the latter as an argument would then serve as the recursion anchor *

About the direction, as the RFC discusses it does not matter that much in the end.


* Edit: I did it:

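# Sentinel value that marks the end of the pipeline.
pipeEnd = { __magic = "computer crimes"; };
# Keep applying each following argument to the accumulated value until pipeEnd is passed.
pipe = a: f: if builtins.isAttrs f && f ? __magic && f.__magic == "computer crimes" then a else pipe (f a);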

Like other ML-based functional languages, Haskell has a couple of commonly used function composition and application operators:

(.) :: (b -> c) -> (a -> b) -> a -> c
g . f = \a -> g (f a)

($) :: (a -> b) -> a -> b
f $ a = f a

(&) :: a -> (a -> b) -> b
a & f = f a

(>>>) :: (a -> b) -> (b -> c) -> a -> c
f >>> g = \a -> g (f a)

All of these operators are pretty widely used within the Haskell community.

In Nix, I really miss having these operators available.

I hope this RFC makes it through, although it would be nicer if you also got some sort of function composition operator included as well!


edit: Oh, I see you already brought most of this up in the RFC!

Function composition is mostly interesting for the so-called “point-free” programming style, where partially applied compositions of functions are preferred over the introduction of lambda terms. However, Nix is not well suited for that programming style for various reasons, nor would that point-free style have nearly as many applications in real-world Nixpkgs code.

Hmm, I wonder what brings you to this conclusion? When I’m writing Nix code, I’m constantly frustrated by not having (.) or (>>>) available. I really feel like it is a constant pain whenever I’m writing any sort of substantial Nix code.

(I think there is an argument that point-free code can be confusing, but that is much more of a problem in Haskell where you can define arbitrary operators.)

Yeah, I got some feedback about that part and forgot to copy over the changes back here. Sorry for the confusion.

Interestingly, I never really felt the need for function composition in the same way I do for argument piping. Do you have some examples where using function composition would be clearly beneficial to the code and better than some alternative operator?

I’m not opposed to proposing the addition of more operators in Nix. I just think that picking just one of them has the best complexity-expressiveness tradeoff.

I’m honestly quite surprised at this. I could dig through my Nix code and find a bunch of examples for you, but I’m just surprised you don’t feel the same.

I get frustrated almost every time I try to use map. In Nix currently, it is not possible to write map (f . g) foolist, but I instead have to write map (x: f (g x)) foolist.

Or even just a function definition:

let doThing = f . g in map doThing foolist

When I’m writing Haskell, normally I stay away from too much point-free style, but a function composition operator is really nice, especially when what you’re doing is essentially just function composition.

Sure, you could write the two above examples like doThing = x: x |> g |> f, but the whole x: x |> part sort of distracts from what’s going on.

That’s a good point, and you may be entirely correct here. I’d of course rather see only |> added, instead of neither |> nor . added.

Oh, I understand. The general problem is that you need to map over some data twice. I agree that this is a very commonly encountered scenario in Nixpkgs. The difference is that I’d use two map operators and be frustrated over the needless nesting, while you’d use one map and be frustrated over the lack of function composition.

So in your example, when using |> instead of doing map (x: x |> f |> g) list I’d have thought of list |> map f |> map g.
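The variants being compared here can be put side by side in a short sketch. It avoids |> so that it evaluates without the operator, and compose is a hypothetical helper standing in for Haskell’s (.); f, g and list are placeholder values.

```nix
let
  f = x: x + 1;
  g = x: x * 2;
  list = [ 1 2 3 ];
  # Hypothetical composition helper: compose g f = x: g (f x).
  compose = outer: inner: x: outer (inner x);
in
{
  # list |> map f |> map g, written as nested application
  twoMaps = map g (map f list);
  # map (g . f) list
  composed = map (compose g f) list;
  # map (x: g (f x)) list
  lambda = map (x: g (f x)) list;
}
```

All three attributes evaluate to [ 4 6 8 ], so the choice between them is purely about readability.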


I have the idea of picking a random representative set of files from Nixpkgs and refactoring with the use of various such operators. It would be interesting to have some data on how much potential each operator truly has in real-world code.

Ah, that makes sense. That’s an interesting observation.

I also agree that list |> map f |> map g is nicer than map (x: g (f x)) list, and probably nicer than map (x: x |> f |> g) list. Although map (g . f) list still seems slightly simpler to me.

Although based on the current push-back from Robert in https://github.com/NixOS/rfcs/pull/148, I’m guessing there are quite a few people that feel like |> is too big of an addition, let alone both |> and a separate function composition operator. I’d be pretty happy if you were able to get even one of these operators into the Nix language.

Personally I am relatively indifferent whether we get Elixir/shell/ML style piping or Haskell style function composition (+ ($)!), as long as we eventually get one of them.

Currently all we have is Lisp-style parentheses or using an excessive number of let-bindings to maintain readability, at the cost of having to give them meaningful names…

And sad but true, even though I know in theory about lib.pipe, I rarely remember that it exists when I need it, and at the same time, I think an actual operator is easier to read than the function call.

I might be biased as an Elixir dev, but to me there shouldn’t be two thoughts about the usefulness of the pipe operator. Wish they’d add it to Rust as well. 🤞

I agree that it’s more terse, but I would argue that keeping the Nix language small, readable and thus learnable is important for widespread adoption.

Piping seems to be a common concept not just in functional but imperative languages and shell scripting as well, so I believe the barrier for understanding it is much lower.

I would add that many shells, including bash (which is what the majority of build steps are written in) have the same concept with the same semantics and precedence, just a different sigil (| vs the proposed |>).

Julia also has |> and it works pretty much the same.

One thing you seem to be ignoring is that lazy, functional, lambda-calculus-based languages are a different beast from other types of languages, including imperative languages, shell scripting languages, etc.

What makes sense for Haskell and Nix may not make sense for other languages, and vice-versa. Just because bash and Julia don’t have a function composition operator, it doesn’t mean it is a difficult concept to learn, or that it doesn’t make sense for Nix.

Although like I said above, given how contentious I imagine the RFC will be, not pushing for a function composition operator is probably the best bet to get something into the language.

Having just come back from a deep-dive into Haskell again, I’d like to put forth that the addition of syntactical complexity can both increase the potential for illegible code, and offer ways to simplify and increase readability of code that would otherwise become illegible.

Proper composition operators add to what you have to know to read a section of code, but lacking them increases the complexity floor and produces code like lib.customization.makeScopeWithSplicing.

Even the simplest language can result in completely incomprehensible monstrosities; taking it to an extreme, we can look at lambda calculus, which has 2 operators and 1 type, but likely 0 legible, non-trivial programs ever written.

TL;DR: Coming to a new language, I’d prefer to find it has tools I don’t need to use rather than complexity from their absence.

I tend to make one liners excessively verbose just so I can indent per-bracket to make code more readable and make sure I have all my };); matching.

Something worth noting is that imperative languages do have something like |> that lets you chain functions together.

In any C-like language you have ‘;’, which allows you to create an order of operations if you manage state imperatively:

{ Thing(); AnotherThing(); }

If your language is OO:

object.Thing().AnotherThing(); 

In Nix the latter is possible in some instances (overrideAttrs).

Functional languages, especially pure ones, tend to favor using the syntax tree itself to establish a “chain” of operations. This demands heavy use of brackets, which is also an issue for imperative languages used in a pure style.

So personally I always find the “imperative kids won’t understand” argument a bit weird.

My meta suggestion is that Nix could use a language extension system like Haskell’s: opting in to an experimental feature on a per-file basis is better than the paralysis that comes with needing to get it right on the first attempt or remaining backwards compatible (though I guess Nix isn’t meant to be “constantly evolving” like Haskell).

I would add version comparison operators together with it (say .<., .<=., .>=. and .>.)

The problem with the current lib.versionOlder and lib.versionAtLeast is that their arguments are reversed and hard to use in functional style. Their counter-intuitive argument order has led to reimplementations in e.g. the Python and Ruby subsystems (all those pythonOlder, atLeast32, …).

Introducing |> makes things even worse:
One could write |> lib.hasPrefix "boo" and get what is expected.
But |> lib.versionAtLeast "2.2" would work the opposite way.
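A small sketch of the argument-order problem, using only builtins so it is self-contained. The local versionAtLeast is assumed to follow the same order as lib.versionAtLeast (true when the first version is at least the second), and flip is a local helper written for this example.

```nix
let
  # Same argument order as lib.versionAtLeast: true when v1 >= v2.
  versionAtLeast = v1: v2: builtins.compareVersions v1 v2 >= 0;
  # Swap the two arguments of a curried function.
  flip = f: a: b: f b a;
in
{
  # "2.3" |> versionAtLeast "2.2" would desugar to versionAtLeast "2.2" "2.3",
  # which asks whether 2.2 is at least 2.3.
  piped = versionAtLeast "2.2" "2.3";
  # Flipping restores the reading "is 2.3 at least 2.2?".
  flipped = flip versionAtLeast "2.2" "2.3";
}
```

Here piped is false and flipped is true, so a pipeline step would need the flipped form to read the way the code looks.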

Ouch, I did not know about that. Sounds to me like fixing these library functions would be the solution, though I guess a breaking change like that would be difficult to implement. More probably, new versions of these functions with different names would need to be added and the old versions deprecated.

Version comparison operators are a whole different story, definitely out-of-scope for this RFC. Would be interesting if they exist in any other language.

I don’t really understand how this affects a potential pipe operator to be honest. It is not meant to be applicable for any use case, and while |> lib.hasPrefix does the intuitively correct thing I’d argue against using it like that. (But if you really want to, ("1.1" |> lib.versionAtLeast) "2.2" should work as well. Again, not recommended.)

I was just thinking about the broader context of which infix operators will be added next (especially if Nix code will look more like Haskell/OCaml), so that we could arrange their priorities even before adding the first such operator (>>= is also on the table, but it is trivial).
