Unknown values propagation in Nix like in HCL

dzmitry-lahoda · March 26, 2023, 8:51pm

HCL is the configuration language used by OpenTofu (a fork of Terraform).
OpenTofu is a DSC tool for configuring a graph of cloud resources (for example, a VM instance and a static IP for it; the VM requires the IP to exist before it is instantiated).
Some resources depend on known data (for example, TF_VAR variables, cloud credentials, constants, local files), which is known at evaluation time.
Some resources depend on unknown data (for example, a resource evaluates an expression that depends on data obtained only after the resources it depends on are created, as when a VM depends on a static IP).
Unknown-become-known data is injected from the cloud via the plan function. So for already created resources (for which there is up-to-date state), values go from unknown to known.
There are various limitations on using language constructs on unknown data: References to Values - Configuration Language | Terraform | HashiCorp Developer.
One of these: apply A, apply B, then apply C. Then destroy ABC and apply ABC all at once. It may fail because C relied on known values of A, which have now become unknown again (so you need to apply AB and then apply C).
Basically there are 2 languages in one: one evaluated at compile time and the other “late binding dynamic at runtime”, and they overlap greatly in syntax.
Unknown values become known without code modification, once state is obtained and refreshed by plan.

So I tried terranix and compared it with Terraform with HCL and with NixOps (Release 2.0 old tracking issue · Issue #1242 · NixOS/nixops · GitHub).

Given the existing limitations, it is unclear whether unknown values can be expressed in Nix, whether they were expressed before in NixOps, or whether such a type system can be built in Nix.

Any ideas on this?

Basically that may be a blocker for adoption of Nix as a cloud configuration language (I am thinking of a sandwich nix → terraform HCL → nix, not nix → terranix with HCL expression strings of that second unknown-values language → nix, as I use now).

pxc · March 26, 2023, 9:31pm

@yannham and @Gabriel439 might have something insightful to say here, since they both maintain somewhat Nix-like languages that are used or intended to be used in conjunction with Terraform.

I think this might be the kind of thing that @offlinehacker was talking about when he said (IIRC) he gave up Kubenix in favor of Pulumi because it’s not easy to handle ‘dynamic’ stuff with Nix.

(Apologies if I’ve tagged anyone in error!)

arianvp · March 27, 2023, 8:42am

I like how Cue handles this:

You can have expressions with unknown values (types) and evaluate over and over again with external input until you hit a fixed point. This works because values form a lattice.

NixOS modules are lattice-y too, especially when you disable type checking when evaluating the module, so you could do a similar trick.
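A minimal sketch of that fixed-point idea, written in Python rather than CUE or Nix. It is not how CUE or the NixOS module system is implemented. The names `UNKNOWN`, `unify`, `evaluate`, the rule table, and the example IP are all hypothetical. Each field starts as unknown (the bottom of the lattice), and re-running the evaluation with more external input only ever fills fields in, never changes them.

```python
# Hypothetical sketch: UNKNOWN is the lattice bottom, any concrete value sits above it.
UNKNOWN = object()

def unify(a, b):
    """Join two values: UNKNOWN yields to anything, equal values agree, others conflict."""
    if a is UNKNOWN:
        return b
    if b is UNKNOWN:
        return a
    if a != b:
        raise ValueError(f"conflict: {a!r} vs {b!r}")
    return a

def evaluate(rules, known):
    """Apply the rules repeatedly until no field changes (a fixed point)."""
    state = dict(known)
    changed = True
    while changed:
        changed = False
        for field, (deps, fn) in rules.items():
            args = [state.get(d, UNKNOWN) for d in deps]
            if any(a is UNKNOWN for a in args):
                continue  # an input is still unknown, so this field stays unknown
            old = state.get(field, UNKNOWN)
            new = unify(old, fn(*args))  # raises if it contradicts a known value
            if old is UNKNOWN:
                state[field] = new
                changed = True
    return state

# dns_record is listed first on purpose: it needs a second pass to resolve.
rules = {
    "dns_record": (["vm_config", "zone"], lambda cfg, zone: f"{zone} -> {cfg}"),
    "vm_config": (["static_ip"], lambda ip: f"listen {ip}:443"),
}

# First evaluation: the static IP does not exist yet, so dependent fields stay unknown.
first = evaluate(rules, {"zone": "example.org"})

# Second evaluation: external input (made up here) supplies the IP.
second = evaluate(rules, {**first, "static_ip": "203.0.113.7"})
print(second["dns_record"])
```

The point of the lattice is that the second run can only add information to the first. If the external input contradicted an already known value, `unify` would raise instead of silently overwriting it.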

yannham · March 27, 2023, 9:29am

Indeed, we have faced the same issue handling computed values in Terraform from Nickel in Terraform-Nickel. @vkleen worked this out (to some extent - I don’t think we can do computations on computed values yet), so he definitely has more insights to share than I have (yes, I’m summoned and I summon someone else).

On a related note, similar ideas have been emerging for an effect system in Nickel, exactly for representing those kinds of only-known-at-deployment-time values (using a computed value would be an effect). Computing with effects would amount to computing as much as possible while keeping expressions involving such unknown values symbolic/partial. Concretely, instead of having a final string expression, you would have an AST of the computation (with nodes evaluated as much as possible: you won’t carry over e.g. "foo" ++ "bar", which you can already evaluate to "foobar", but you would have _COMPUTED_(resource.id) ++ "bar").
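To make the "evaluate as much as possible, keep the rest symbolic" idea concrete, here is a small Python sketch. It is not Nickel's effect system, only an illustration. `Computed`, `Expr`, `lift`, `computed` and `concat` are hypothetical names. Concatenation folds adjacent known strings immediately and leaves computed references as nodes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Computed:
    """Placeholder for a value only known at deployment time (hypothetical)."""
    ref: str

@dataclass(frozen=True)
class Expr:
    """A partially evaluated string: a sequence of known strings and Computed nodes."""
    parts: tuple

    def __add__(self, other):
        return concat(self, other)

    def __radd__(self, other):
        # Lets a plain str appear on the left-hand side of +.
        return concat(other, self)

def lift(value):
    """Wrap a plain str or Computed into an Expr, leave an Expr untouched."""
    return value if isinstance(value, Expr) else Expr((value,))

def computed(ref):
    return lift(Computed(ref))

def concat(left, right):
    """Concatenate, folding neighbouring known strings into one."""
    merged = []
    for part in lift(left).parts + lift(right).parts:
        if isinstance(part, str) and merged and isinstance(merged[-1], str):
            merged[-1] += part  # both sides are known: evaluate now
        else:
            merged.append(part)  # a computed value is involved: stay symbolic
    return Expr(tuple(merged))

known = lift("foo") + "bar"
partial = computed("resource.id") + "bar" + "-baz"
prefixed = "id=" + computed("resource.id")

print(known.parts)
print(partial.parts)
```

Here `known` collapses to a single string part, while `partial` keeps the `Computed` node followed by one already folded string.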

Doing that without special language support, e.g. in Nix, is technically possible, but probably painful. I think your conclusion that there are two languages is spot-on. You would have to use a specific API/DSL for handling computed values, and in particular know beforehand which values are computed (or which expressions contain a computed value) in order to use the right operation, which is going to be a pain without a type system. Or use an API/DSL that is able to handle both dynamically, thus departing from normal Nix (including all basic operations like string interpolation, arithmetic, etc.). This DSL would produce ASTs encoded in Nix, and you would then have to translate them to Terraform (if I recall correctly, the JSON syntax of Terraform actually handles the “late binding dynamic at runtime” language inside strings as well). Well, at least without thinking too hard about it - there might be a better and more clever encoding, but the fact is, the Nix language is quite simple and straightforward, which doesn’t leave many features as candidates to be abused for doing clever things…
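As a rough illustration of that last translation step, the Python sketch below renders a list of known strings and references into a single string using `${...}` interpolation, which Terraform's JSON syntax evaluates as a template inside string values. The helper names `ref` and `render`, and the resource types `example_ip` and `example_vm`, are made up for the example and do not belong to any real provider. Literal `${` and `%{` in known strings are escaped so that they are not mistaken for template sequences.

```python
import json

def ref(address):
    """Mark a resource attribute address as deployment-time (hypothetical helper)."""
    return ("ref", address)

def render(parts):
    """Turn known strings and refs into one Terraform template string."""
    out = []
    for part in parts:
        if isinstance(part, tuple) and part[0] == "ref":
            out.append("${" + part[1] + "}")
        else:
            # Escape template introducers that occur in literal text.
            out.append(part.replace("${", "$${").replace("%{", "%%{"))
    return "".join(out)

# Hypothetical resource types, only to show where the rendered string ends up.
config = {
    "resource": {
        "example_ip": {"static": {}},
        "example_vm": {
            "web": {
                "url": render(["http://", ref("example_ip.static.address"), ":8080"]),
            }
        },
    }
}

print(json.dumps(config, indent=2))
```

A Nix version would look similar in spirit: the DSL builds the list of parts, and a final pass produces the strings before the attribute set is written out as JSON.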

yannham · March 27, 2023, 9:44am

I’m curious, do you know if this is actually used for the specific use-case of e.g. Terraform? Indeed, CUE can handle partially defined values (and as you mention, so do NixOS modules, to some extent).

However, from what I understand, when interacting with an external tool like Terraform, you have to first evaluate your Nix expression and then feed the result to TF, in two distinct phases. Unless you have some more evolved framework that can interleave the execution of TF itself with Nix evaluation, you need to produce everything upfront - including computations on yet-unknown values (computed values). Thus, you would have to translate - or rather, transpile - a partially defined value (say, a NixOS module) to the fragment of HCL (or TF JSON) that corresponds to operations on computed values, which doesn’t sound easy to do at all in pure Nix code. That being said, one can infamously redefine basic primops in user code in Nix, so maybe this is a trick to explore here?

pxc: I think this might be the kind of thing that @offlinehacker was talking about when he said (IIRC) he gave up Kubenix in favor of Pulumi because it’s not easy to handle ‘dynamic’ stuff with Nix.

Same conclusion here as well, by the way: Pulumi supports this by design, which makes it a much simpler target for any alternative cloud language candidate like Nix.

dzmitry-lahoda · March 27, 2023, 10:58am

One dumb way is to use a Terraform provisioner to call Nix, so I can feed some Terraform resource to a Nix expression and get back some data (I will use it anyway, because I need to run nixos-rebuild on a remote machine with Terraform output).

The second is a transpiler; I use terranix. Terranix is dumb in that it just writes Nix to a JSON file. That's it.
In theory it could traverse the Nix expression and look for module attributes marked mkUnknown. When it finds them, it parses the Nix code inside and generates HCL functions, failing compilation if it cannot transpile or if more static resources are not defined (so it will be COMPUTED(resource.id)).

dzmitry-lahoda · March 27, 2023, 11:04am

So abusing unknown values has its own issues. At least I did it :)

So

Do some HCL and terraform apply it to cloud.
Add more stuff to HCL, terraform apply
Do terraform destroy
Run terraform apply
Expected: it works
Actual: Some new code was built on the premise that some values are known, but they are back to unknown.
Solution: They call it terragrunt. You basically make several layers of HCL and wire each layer to the previous one using TF_VARs, a direct reference to the state of the previous layer, or by reading data directly from the cloud.

So it is: 1. terraform apply layer A, 2. terraform apply layer B, 3. etc. (I did 5 layers).
Judging from this practice, Pulumi may be a non-reproducible nightmare.

So I am doing something somewhat similar in Nix:

Nix build base cloud image
Feed nix output to terraform
Feed output of terraform into nix
Feed nix output to terraform
etc…
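
The layering above (terragrunt layers, or alternating Nix and Terraform) can be sketched as a tiny Python runner. This is only a model of the data flow. A real setup would shell out to `nix build` and `terraform apply`, while here each layer is a stand-in function returning made-up outputs, and `run_layers` is a hypothetical name. Each layer declares which values it needs, and the runner refuses to start a layer whose inputs are still unknown.

```python
def run_layers(layers, inputs):
    """Run layers in order, feeding each layer's outputs to the later ones."""
    known = dict(inputs)
    for name, needs, apply in layers:
        missing = [key for key in needs if key not in known]
        if missing:
            # This is the "back to unknown" failure, caught before anything runs.
            raise RuntimeError(f"layer {name!r} needs unknown values: {missing}")
        known.update(apply({key: known[key] for key in needs}))
    return known

# Stand-ins for real commands; all output values are invented for the example.
layers = [
    ("nix-image", ["nixpkgs_rev"], lambda v: {"image_hash": "img-" + v["nixpkgs_rev"]}),
    ("tf-network", ["image_hash"], lambda v: {"static_ip": "203.0.113.7"}),
    ("nix-system", ["static_ip"], lambda v: {"system_closure": "nixos-for-" + v["static_ip"]}),
    ("tf-deploy", ["system_closure", "static_ip"], lambda v: {"deployed": True}),
]

result = run_layers(layers, {"nixpkgs_rev": "abc123"})
print(result["system_closure"])

# Skipping the earlier layers reproduces the failure mode: static_ip is unknown.
try:
    run_layers(layers[2:], {"nixpkgs_rev": "abc123"})
except RuntimeError as error:
    print(error)
```

Making the `needs` list explicit is the part a type system for unknown values would give for free: the order of layers falls out of which values each one requires.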

So making unknown values visible in the type system could be good overall. At least a gradual move from unknown-heavy usage (Pulumi/HCL/Cue) to something more static (Nix) could be awesome. So a startup can do the dirty work first and then refactor for a production release.

dzmitry-lahoda · November 23, 2023, 7:47pm

I ended up here and this way

Before that I tried nix + cloud 3 times (nixops, terranix, terraform hacks with the instantiate approach, and my own approaches).

This one appears to do best + suggests improvements + is super flexible + no hacks at all.

All the Terraform modules I saw that wrap NixOS in a module are bearish tbh, because cloud is always custom and no one TF module fits all.

So the article suggests how to sandwich nix and tf, and trigger one from the other. And I have a couple of my own tricks for making change detection and state modification easy.

I suppose it is not really nix vs terraform in general, but more about what the computation model of that thing is. I understand it, but hell, how to formalize it so that it can be expressed just in Nix?

Including sensitive keys, lazy fixed nix, lazy remote diff state of tf, manual interaction for fix, immutable ssh nix for manual interaction, nix modules, using nix hashes as input to terraform VAR to force trigger in terraform to rerun provisioner, tf plan-dry-run.

Some unified model.

So here is an end-to-end run with CI and secret integration