lib.strings.trim discards string context: bug or not?

I ran into a big foot-gun. I’m not sure if this is intentional and documented anywhere, or if it’s a genuine bug.

It seems that lib.strings.trim somehow discards the string context of its input. This has the effect that store paths interpolated into a trimmed string in some derivation do not become part of the derivation’s closure.

(For a concrete example: you might interpolate a path into a trimmed string to copy a resource file into the store when creating a configuration file with environment.etc."some/path".text. Because of this quirk, the resource file is copied into the store at evaluation time, but it is not included in the system closure.)

Here’s a minimal working example.

I’ve tested it both on the current NixOS 26.05 beta (Nix 2.34.7, Nixpkgs rev 705e9929918b43bd7b715dc0a878ac870449bb03), and on NixOS 25.11 (Nix 2.31.5, Nixpkgs rev 25f538306313eae3927264466c70d7001dcea1df).

(The session below runs nix repl from the 25.11 release against the 25.11 flake:)

$ nix run github:nixos/nixpkgs/nixos-25.11#nix repl github:nixos/nixpkgs/nixos-25.11
Nix 2.31.5
Type :? for help.
Loading installable 'github:nixos/nixpkgs/nixos-25.11#'...
Added 7 variables.
checks, devShells, formatter, htmlDocs, legacyPackages, lib, nixosModules
nix-repl> pkgs=legacyPackages.x86_64-linux                       

nix-repl> builtins.getContext "${pkgs.hello}"                    
{
  "/nix/store/8550ijfw8imvqnky30gwdvwlidq3g42b-hello-2.12.2.drv" = { ... };
}

nix-repl> builtins.getContext (lib.strings.trim "${pkgs.hello}")
{ }

Since I can’t find this behavior documented anywhere and the source of trim doesn’t appear to be using any obvious unsafeDiscardStringContext calls or similar, this sure does feel like it could be a bug. But I wanted to ask before creating a report.

Does anyone have some idea of what could be causing this? If this is expected: Are there other classes of string functions that can cause the context to be discarded, that you should look out for?

I don’t know whether it’s expected or a bug, but I think it’s related to trimWith using builtins.match with a regex.

From a general standpoint, not sure if there’s a clean way for a regex to know which bits of context to keep/discard.

From a specific standpoint, not sure if trim should explicitly add the string context back in, since you know you are only discarding whitespace.
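To check the builtins.match theory without going through lib.strings.trim, here is a minimal sketch. It assumes builtins.toFile as a cheap way to get a string with context, and the regex is only similar in spirit to a whitespace-trimming pattern; it is not copied from the Nixpkgs source.

```nix
let
  # builtins.toFile returns a store path string that carries context.
  path = builtins.toFile "demo" "hello";
  s = "  ${path}  ";

  # Illustrative trimming regex (an assumption, not the Nixpkgs one).
  groups = builtins.match "[ \t]*(.*[^ \t])[ \t]*" s;
in
{
  # Does the untrimmed input carry context?
  input = builtins.hasContext s;
  # Does the captured group still carry it?
  matched = builtins.hasContext (builtins.head groups);
}
```

If builtins.match is the culprit, input should be true while matched comes out false. This can be evaluated with nix eval --file or pasted into nix repl.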


Okay, so apparently builtins.match is the real culprit (quite surprising, since I would expect the builtins to be most careful with this sort of internal bookkeeping): https://github.com/NixOS/nix/issues/2547

But this also means it’s a bit of a larger issue: Every function that uses match (and split, as mentioned in that link!) has the same foot-gun behavior. The issue above mentions that you could use appendContext to restore the deleted context. But evidently not all Nixpkgs library functions do this (let alone any derivations that may use the builtin directly). If nothing else, maybe the project should add a library wrapper that does this, and add a check to ensure nothing is using the raw match function.
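A rough sketch of what such a library wrapper could look like. The name matchKeepingContext is hypothetical and nothing like it is claimed to exist in Nixpkgs. It errs on the side of caution by attaching the full context of the input to every captured group, using the same substring 0 0 trick that lib.addContextFrom relies on.

```nix
let
  # Hypothetical helper: like builtins.match, but every captured group
  # inherits the whole string context of the input.
  matchKeepingContext = regex: s:
    let
      groups = builtins.match regex s;
      # An empty substring of s keeps the context of s; prepending it to a
      # group transfers that context. Unmatched groups are null and are
      # passed through unchanged.
      reattach = g:
        if builtins.isString g then builtins.substring 0 0 s + g else g;
    in
    if groups == null then null else map reattach groups;

  path = builtins.toFile "demo" "hello";
  s = "  ${path}  ";
in
builtins.hasContext
  (builtins.head (matchKeepingContext "[ \t]*(.*[^ \t])[ \t]*" s))
```

The expression is expected to evaluate to true. The trade-off is over-approximation: a group that no longer contains a given store path still drags that path into the closure.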

Heh, we came to the same conclusion at the same time :slight_smile:

True! Though you could err on the side of caution and always propagate the input’s full context to the result. The closure could then end up larger than actually needed in the final derivation. But technically the origin of that context, i.e. an interpolated store path, is part of the input to the match function. So it may affect which matches are returned, so shouldn’t it be considered as such anyway? I might be conflating the role of derivation inputs vs store paths to be included in the closure, though. Maybe this makes no sense.

Regardless, in the rare cases where you would end up with a massively bloated closure and you really need to throw most of it away, that’s what unsafeDiscardStringContext is for, isn’t it? The point being that the discarding is made explicit.

I don’t understand the C++ code at all, but the person who created the above issue mentions that it looks like the code intends to handle the context gracefully, but then doesn’t. :person_shrugging:

EDIT: Also, the documentation mentions that functions can explicitly check that their input has no context if they are such that they really can’t do anything sensible with it. Again, this is meant to force their callers to make a decision to throw it away. If regex matching and splitting, in the general case, can’t deal with context, they should implement this check, IMHO.
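A sketch of that stricter alternative, written as a user-level wrapper rather than a change to the builtin. The name strictMatch and the error message are made up for illustration.

```nix
let
  # Hypothetical wrapper: refuse context-carrying input instead of
  # silently dropping the context.
  strictMatch = regex: s:
    if builtins.hasContext s then
      throw "strictMatch: input has string context; discard or re-attach it explicitly"
    else
      builtins.match regex s;
in
{
  # Plain strings behave exactly like builtins.match.
  ok = strictMatch "a(.*)" "abc";
  # Callers with store paths must opt in to losing the context.
  explicit = strictMatch ".*-(.*)"
    (builtins.unsafeDiscardStringContext (builtins.toFile "demo" "hello"));
}
```

Passing the toFile result directly, without unsafeDiscardStringContext, would hit the throw, which is the point: the decision to drop context becomes visible at the call site.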


Here is my workaround for the time being. Instead of using trim directly, define:

trim' = s: lib.addContextFrom s (lib.trim s)

But as mentioned, this affects potentially more functions, which will have to be wrapped on a case by case basis.

Sure, if you want a genericised version, then just make the function lib.trim into an arg:

wrapWithContext = f: s: lib.addContextFrom s (f s)

As for the original question, the context should probably be preserved for __match, since __substring 0 0 preserves it (as demonstrated by addContextFrom itself) :slight_smile:
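For completeness, a small sketch of how the wrapper might be exercised. It assumes <nixpkgs/lib> resolves through NIX_PATH and again uses builtins.toFile only as a convenient source of string context.

```nix
let
  # Assumption: <nixpkgs> is on NIX_PATH and its lib provides trim.
  lib = import <nixpkgs/lib>;

  wrapWithContext = f: s: lib.addContextFrom s (f s);
  trim' = wrapWithContext lib.trim;

  path = builtins.toFile "demo" "hello";
  s = "  ${path}  ";
in
{
  # Context after the plain library function.
  plain = builtins.hasContext (lib.trim s);
  # Context after the wrapped version.
  wrapped = builtins.hasContext (trim' s);
}
```

Given the behavior reported in this thread, plain should be false and wrapped true. The wrapper only fits functions that take a single string and return a string; anything returning a list, such as builtins.split, needs the context re-attached per element.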


Indeed :slight_smile: Though what I meant was that you will have to figure out when to apply this wrapper on a case-by-case basis, since the problematic builtins could be buried deep in other library functions, which wouldn’t be obvious up front. (Unless some static analysis tools can be configured to check for this? I haven’t had the chance to delve into the world of Nix linters yet, so maybe there is!)
