Path normalization - source of sneaky mistakes?

Hi,

I just stumbled over an issue a colleague found in our code that was obviously not tested automatically … :)

While researching how this works, I noticed that it’s easy to lose track of whether a value in Nix is a string or a path. Both can be concatenated using +, but with different semantics.
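A minimal sketch of that difference, which can be evaluated with `nix-instantiate --eval --strict`. The location /tmp/example is a made-up placeholder and does not need to exist, and the attribute names are only for illustration:

```nix
let
  base = /tmp/example;        # a path value (hypothetical location)
  baseStr = "/tmp/example";   # a string with the same characters
in {
  # path + string: the result is a path and is normalized after every +,
  # so the trailing slash is dropped before "sub" gets appended.
  pathResult = base + "/" + "sub";        # → /tmp/examplesub

  # string + string: plain concatenation, the slash survives.
  stringResult = baseStr + "/" + "sub";   # → "/tmp/example/sub"
}
```

Since + is left-associative, the type of the leftmost operand decides which of the two behaviours applies to the whole chain.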

I do not have a specific change to suggest, but I found a few reported issues that led to explicit fixes around normalization (for example “[Nix-dev] Using string as path to eg. builtins.readFile”).

When I tried to investigate how this is implemented, I didn’t get very far. Would someone care to point me to where in the Nix code the + operation is defined for path + string and string + string?

Also, is it just me who feels a bit uneasy about not being able to quickly check whether these things are correct by looking at a code base? I searched the code for the + "/" pattern and found it used in situations like these:

  • ipaddress + "/" + prefix
  • config.boot.kernelPackages.kernel + "/" + config.system.boot.loader.kernelFile;
  • path: path + "/" + subDir (strings.nix - is path a string or a path?)
  • gititShared = with cfg.haskellPackages; gitit + "/share/" + pkgs.stdenv.system + "-" + ghc.name + "/" + gitit.pname + "-" + gitit.version;
  • themebase = string(themesdir) + "/"; (interesting, someone uses an explicit cast here)
  • ${plugin.out + plugin.kodiPlugin + "/" + plugin.namespace }

To me it looks likely that there is code out there hiding problems that people won’t be able to spot and fix just by looking at it …

Any input is warmly welcome. :)

I always do (path + "/${otherthing}") when concatenating paths. That works as expected whether the initial path is a true path or a string of some kind.
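As a sketch of why this idiom behaves the same in both cases: the slash and the next component arrive in a single string, so normalization never sees a bare trailing slash. The helper name `join` and the location /tmp/example are hypothetical and only serve the illustration:

```nix
let
  # Hypothetical helper: appends one component to a base that may be
  # either a path or a string.
  join = base: name: base + "/${name}";
in {
  fromPath = join /tmp/example "sub";       # → /tmp/example/sub (a path)
  fromString = join "/tmp/example" "sub";   # → "/tmp/example/sub" (a string)
}
```

The result keeps the type of the base, so a path stays a path and a string stays a string.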

Right, that’s what came out of the discussion in June last year, and as an idiom it solves the problem reliably on a mechanical level.

Looking at the code base, and at how people around me who approach the language on a more “user oriented” level work, this idiom seems a bit too “surprising” though, and I have a hard time trusting that “things won’t go wrong”. Better test coverage obviously helps against those surprises, but in practice you never catch all the edge cases, so having a better safety belt here might be worthwhile.

One thing I’m trying to find out is why normalization happens at the point in evaluation where it does, and whether it might be a good idea to move it to a different (later?) point, e.g. when crossing IO boundaries or some similar delineation.