Using Nix for an ETL Pipeline

Maybe this is a crazy thought, but has anyone played around with generating a Nix expression for the dependency graph of an ETL pipeline?

Imagine you have 10 text files and want to generate some entities from them. Each entity has attributes and corresponding values, which may take some time to generate. You first run a separate program that peeks into each file and just constructs a dependency tree, without actually generating the attributes and values. So from file 1 you learn that you can create a total of 5 intermediate entities. You then look at files 2 and 3 and grab the entries (their IDs) that relate to those 5 intermediate entities. You keep doing this until you’ve figured out which intermediate things you need to construct before you can build your desired final output, and you know all of their inputs. Now you ask Nix to build this with its usual caching and parallel build niceties, and you can trivially run it on a remote build machine or, I guess, an entire fleet.
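To make the idea concrete, here is a rough sketch of what the generator step might look like, written in Python purely for illustration. Everything in it is an assumption: the `GRAPH` dictionary stands in for whatever the "peek" program would emit, the entity and file names are made up, and the `cat` build step is a placeholder for the real attribute generation. The only Nix pieces it relies on are `pkgs.runCommand` and path interpolation.

```python
from graphlib import TopologicalSorter

# Hypothetical output of the "peek" step: each node lists the raw files
# it reads and the intermediate entities it depends on.
GRAPH = {
    "entity_a": {"files": ["./file1.txt", "./file2.txt"], "deps": []},
    "entity_b": {"files": ["./file3.txt"], "deps": []},
    "final": {"files": [], "deps": ["entity_a", "entity_b"]},
}


def render_nix(graph):
    """Render the graph as a Nix attribute set of runCommand derivations."""
    # static_order() raises CycleError on a cyclic graph and yields
    # dependencies before the nodes that depend on them.
    order = TopologicalSorter(
        {name: node["deps"] for name, node in graph.items()}
    ).static_order()

    lines = ["{ pkgs ? import <nixpkgs> { } }:", "rec {"]
    for name in order:
        node = graph[name]
        # Raw files become store paths; dependencies refer to sibling attributes.
        inputs = ["${" + f + "}" for f in node["files"]]
        inputs += ["${" + d + "}/data" for d in node["deps"]]
        lines += [
            f"  {name} = pkgs.runCommand \"{name}\" {{ }} ''",
            "    mkdir -p $out",
            # Placeholder build step: a real pipeline would call its own tool here.
            f"    cat {' '.join(inputs)} > $out/data",
            "  '';",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_nix(GRAPH))
```

If the output were saved as, say, `pipeline.nix` next to the input files, something like `nix-build pipeline.nix -A final` should build the intermediate entities first and reuse any that are already in the store. One thing to keep in mind with this approach is that Nix hashes whole input files, so an entity that only needs a few entries from a file would still be rebuilt whenever anything in that file changes, unless the relevant entries are split out into their own inputs first.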

The above is currently my day job, except we do all of this manually in Golang. I’m not saying I want to do this in production right now, but it would be a fun thing to play around with. Does anyone have experience with this, or is there some other tool that would be much better suited here?
