Terminal emulator leaks environment variables to shell

ruro · September 30, 2023, 1:51am

I’ve recently run into the following issue: a lot of apps built with nix use custom wrappers that set the environment variables needed to run the program (such as PATH, LD_LIBRARY_PATH and more). This is normally fine. However, because environment variables are automatically propagated to all subprocesses, this can lead to some nasty issues.
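A minimal sketch of that propagation, for anyone who wants to see it in isolation. The variable name `DEMO_WRAPPER_VAR` is made up for illustration and is not something any real wrapper sets.

```python
import os
import subprocess
import sys


def value_seen_by_child(name: str, value: str) -> str:
    """Set a variable in this process, then ask a child process what it sees."""
    previous = os.environ.get(name)
    os.environ[name] = value  # stands in for an `export` inside a wrapper script
    try:
        # No env= argument: the child simply inherits our environment.
        result = subprocess.run(
            [sys.executable, "-c",
             "import os, sys; print(os.environ.get(sys.argv[1], ''))", name],
            capture_output=True,
            text=True,
            check=True,
        )
    finally:
        # Put our own environment back the way it was.
        if previous is None:
            del os.environ[name]
        else:
            os.environ[name] = previous
    return result.stdout.strip()


if __name__ == "__main__":
    print(value_seen_by_child("DEMO_WRAPPER_VAR", "set-by-wrapper"))
```

The child never asked for the variable; it gets it simply because it was spawned by a process that had it.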

An especially annoying case of this is terminal emulators. After a terminal emulator sets up some environment variables that are required to run the emulator itself, these variables get propagated directly to your shell session.

For example, guake is a terminal emulator written in python. As such, it sets a bunch of environment variables, including PATH and PYTHONNOUSERSITE. As a consequence, the specific version of python that is used by guake ends up having a higher priority than the “system” python that I’ve installed into systemPackages.

Is there some way to make nix applications run using all the “correct” environment variables, but without poisoning my shell session?

And, by the way, the following “solutions” aren’t really viable imho

1. manually unset the variables you don’t want
2. completely reset the shell environment

because (1) requires me to somehow know in advance which environment variables may interfere with other programs (for each terminal emulator I might want to use, for each shell, for each program), and (2) is too disruptive, as MOST environment variables SHOULD be propagated from the current “session” (PAM or tty or graphical or ssh or whatever).

abathur · September 30, 2023, 4:34pm

I’m not sure what the “right” thing to do is here and I’m skeptical that there’ll be a 1-easy-trick to fix this universally.

I guess the wrapper generators in nixpkgs could include an option that can also kick out a generic executable wrapper that reverses the environment changes before execing its argument. In cases where this is a concern, we could use the option and then patch the code that creates the subprocess to shim that generic wrapper in where it can clean the environment.

In the guake case that might look like patching it into line 554 here:

https://github.com/Guake/guake/blob/dcbf64c24e133b57d9023347bd17c13589ebb85f/guake/terminal.py#L554-L562

          argv = []
          user_shell = self.guake.settings.general.get_string("default-shell")
          if user_shell and os.path.exists(user_shell):
              argv.append(user_shell)
          else:
              try:
                  argv.append(os.environ["SHELL"])
              except KeyError:
                  argv.append("/usr/bin/bash")

ruro · September 30, 2023, 5:11pm

If I am being honest here, I don’t quite understand why this is a problem in the first place. I was under the impression that “properly” packaged Nix applications should be patched (replacing “unqualified” references to other programs with their full /nix/store paths) and setting PATH should basically only happen during build or with “uncooperative” applications (for example, binary blob applications that verify the checksums of their binaries and so can’t be patched).

As I mentioned in my original post, reverting the environment changes after the fact sounds like an insanely hacky/fragile solution.

Is there really no sane way to fix this?

AndersonTorres · September 30, 2023, 8:50pm

Sometimes applications are not designed with running on NixOS in mind.
Reusing global environment vars is just a nasty example of this. Ideally an application should not rely on implicit things like “a Bash shell” or “only one Python in $PATH”.

abathur · September 30, 2023, 9:30pm

I agree with the core of your impression, so it may help if you can qualify what kind of scope and approach counts as sane for you here. The initial question was about making applications run with the “correct” environment variables without poisoning your shell session, which I read as a request for some kind of user-facing fix. I’m not aware of anything universal and user-facing, so I gave you the closest thing I can imagine working.

Patching out dependencies on the runtime environment is indeed much better with respect to purity, but it doesn’t sound like you’re counting that as a sane fix? Would you have been disappointed if my first answer suggested you could go patch out environment dependencies as deep into the dependency chain as you needed to go to get a given application working without leaking envs?

ruro · September 30, 2023, 10:05pm

Well, I kind of hoped that the problem I was experiencing was due to poor packaging of guake in particular and that there is some “simple” (not requiring manual patching and hacky workarounds) way to fix it. Something along the lines of:

Oh, no! The guake package is just doing it wrong™. It declares some of its runtime dependencies using sometimesPropagatedBuildInputs, but it should have been using definitelyNotPropagatedNativeBuildInputs. Then it would stop leaking its dependencies into the environment.

or

Yeah, this is a known issue, but you can fix it by adding a wrapExecWithCleanEnvHook to the nativeBuildInputs. It will patch each build output, replacing system calls to exec with a magic version that “does the right thing” (somehow, idk).

I mean, considering how patchelf and other nix black magic works, these hypotheticals that I am describing aren’t even all that far-fetched.

I do want a “universal” fix, in the sense that I don’t want to hand-craft a solution for this specific case. I want to learn how this “class” of problems is supposed to be addressed in the Nix ecosystem in general.

I am not sure what the distinction is between a “user-facing fix” and any other kind of fix. I am open to any solution, even if it involves modifying the derivation or even (minor) patching of the package source (something that could be upstreamed to the original package source, not “rewrite it in some language other than python”).

I would count that as a sane fix, if it can be done in a “reasonably robust way”. In particular, I would like to minimize 1) the possibility, that I missed something that should be patched and 2) the possibility that these patches will stop working after a source version bump.

So “you have to read the whole source code of guake and find all the places where bad_thing_X is done and manually replace it with good_thing_Y” is not “viable” for me. But any kind of automated or semi-automated patching (as a general technique) is definitely “valid” imho.

uep · September 30, 2023, 10:05pm

The way to fix this is in the terminal emulator itself, unfortunately.

It will need to grow a feature whereby you can specify the environment for the processes it starts. It’s probably not a common feature, because the processes most commonly started in terminals (shells) typically also manipulate the environment for themselves and subprocesses.

There are also some tools used to sanitise an environment, for security or as part of build pipelines, that can act as a mostly transparent wrapper, so you could tell the terminal emulator to use one of those around your shell.
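
One such transparent wrapper is plain `env` with its `-u NAME` option (a GNU/BSD extension, not something POSIX guarantees). Here is a small sketch of building the command a terminal emulator would be told to run; the helper name and the variable list are illustrative only, and note that this still requires knowing the variable names up front.

```python
from typing import Iterable, List


def wrap_shell_command(shell: str, unset: Iterable[str]) -> List[str]:
    """Build an argv that starts `shell` via `env`, unsetting the given variables."""
    argv = ["env"]
    for name in unset:
        argv += ["-u", name]  # `env -u NAME` removes NAME from the child's environment
    argv.append(shell)
    return argv


if __name__ == "__main__":
    # Paste the printed command into the terminal emulator's "custom command" setting.
    print(" ".join(wrap_shell_command("/bin/bash", ["PYTHONNOUSERSITE"])))
```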

ruro · September 30, 2023, 10:20pm

From the point of view of the emulator, it’s already doing the right thing™. Propagating environment variables from the current session to the shell is almost always the correct thing to do (arguably - ALWAYS the correct thing to do). A lot of the environment variables are set by the other components of the session (pam, xserver, the desktop environment, etc). By the time the terminal emulator gets invoked, the environment has a lot of useful and even necessary information in it.

I am 99% sure that any terminal emulator author would reject such a “feature” on the grounds that this is well outside the scope of what a terminal emulator is supposed to do. And “rolling my own” terminal emulator just to work around this issue is not an option (for pretty obvious reasons).

TLATER · October 1, 2023, 2:54am

The problem is quite different here. Nix cannot patch the terminal environment at runtime. patchelf is indeed cool black magic, but it’s possible because it can be applied automatically at build time.

The underlying problem is that environment variables need to be set at all. I think this is fairly unique to the terminals built in python or using gtk. I believe all others can suitably be fixed by patchelf and the occasional substituteAll for terminals that shell out to things for some reason.

However python packaging in nixpkgs relies on environment variables to figure out where modules live. I believe in theory this could be changed with python init scripts, which would need to be auto-generated based on what your withPackages does. It’s tricky, and would probably require reworking a lot of what the python stuff does.
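
To make the init-script idea a bit more concrete, here is a sketch of what a `sitecustomize.py`-style hook could do: read the module directories from a private variable, register them with `site.addsitedir`, and remove the variable so child processes never see it. The variable name `NIX_PYTHONPATH` is borrowed from existing nixpkgs discussions, but the behaviour shown is an assumption for illustration, not a description of what nixpkgs actually ships.

```python
import os
import site
from typing import List, MutableMapping

PRIVATE_VAR = "NIX_PYTHONPATH"  # name used for illustration; behaviour below is assumed


def consume_private_path(env: MutableMapping[str, str]) -> List[str]:
    """Add every directory listed in PRIVATE_VAR to sys.path, then drop the variable."""
    raw = env.pop(PRIVATE_VAR, "")  # popping means subprocesses will not inherit it
    added = []
    for entry in raw.split(os.pathsep):
        if entry:
            site.addsitedir(entry)  # also processes any .pth files in that directory
            added.append(entry)
    return added


# In a real sitecustomize.py this would run at interpreter start-up:
# consume_private_path(os.environ)
```

This still needs one environment variable to carry the paths in, but it is consumed before any user code (or shell) is started.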

Even then I’m not sure if you would still need an environment variable to change the init script location. Ultimately some of these ecosystems are just very environment variable reliant. I’m sure there’s at least one which cannot be convinced to load modules from anywhere but a prefix set at build time, which would then require rebuilding the thing every time you set new modules. Since that would be utterly unreasonable, I don’t think a generic solution exists, short of quite heavy manual patching.

It’s probably worth looking into whether this can be solved at least for python though. Launching things with python is a common enough use case that we should probably avoid polluting environments whenever a nix-built python is used.

abathur · October 1, 2023, 4:51am

Right. In that case I can stand by my statement that I doubt there’s a 1-easy-trick here. (I’m not aware of one, and I’m fairly confident we’d need distinct solutions for different languages/toolchains/ecosystems.)

patchelf is a good example. Its powers are bounded by the binary formats it understands (and by the experience/expertise of the few dozen people who’ve worked on it over many years now). Thankfully, targeting binary formats is a pretty good place to start. A big bite out of the problem space.

When we move beyond binary formats into interpreted languages/toolchains, I’m fairly sure we’ll need different tools for them. Each one will be a separate bite out of the problem space. (I guess there’s some chance such tools can share an interface, but I suspect this would only work if the tool’s scope is very tight.)

I can spitball a little about how I perceive the playbook. (Fair warning: my knowledge isn’t very broad. I have perspective from developing resholve to try to meet this need for Shell/Bash packaging with Nix, but that’s just a small corner of the “class”. List may be incomplete.)

Patch the interpreter.

If we’re lucky, whatever we need to square with the Nix model can be straightforwardly patched in the interpreter in a way that won’t create lots of splash damage or a huge maintenance burden.

I’m not sure, but I imagine the situations where this is viable will be somewhat limited by the need to support different use-cases. For example, the kind of clean encapsulation we’d like to see from Nix-packaged software may conflict with using Nix to supply the same program as a dependency in someone’s development environment where the user may indeed want it to be overridden by whatever’s in that environment.

Patch the interpreted source.

This might look like just injecting some code at the top or bottom, a few bits of search/replace, or even extensive source rewriting.

The Python tooling already does a bit of the first to inject a module search path. I do something more like the last in resholve to replace bare invocations with absolute store paths.

The more ambitious this gets, the more important it is that you have a good parser and a good understanding of the ecosystem. For example:
- we can’t just inject arbitrary code at the top of a Python script. Any imports from __future__ have to come first, so any tooling here has to be at least sophisticated enough to avoid breaking something like this.
- we can’t just blindly rewrite grep → /nix/store/...-grep-.../bin/grep in a shell script; there could of course be a shell function or an alias named grep
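
As an illustration of the first point, here is a sketch of an injector that uses Python’s own parser to place a snippet after the module docstring and any `from __future__` imports rather than at the very top. The function name is made up, and this is not how the nixpkgs Python tooling is implemented.

```python
import ast


def inject_after_future_imports(source: str, snippet: str) -> str:
    """Insert snippet after the docstring and any `from __future__` imports."""
    tree = ast.parse(source)
    insert_line = 0  # number of leading source lines that must stay ahead of the snippet
    for index, node in enumerate(tree.body):
        is_docstring = (
            index == 0
            and isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
        )
        is_future = isinstance(node, ast.ImportFrom) and node.module == "__future__"
        if not (is_docstring or is_future):
            break
        insert_line = node.end_lineno

    lines = source.splitlines(keepends=True)
    if insert_line == 0:
        # No docstring or __future__ import: still keep a shebang/encoding line first.
        while insert_line < min(2, len(lines)) and lines[insert_line].startswith("#"):
            insert_line += 1

    head = "".join(lines[:insert_line])
    tail = "".join(lines[insert_line:])
    if head and not head.endswith("\n"):
        head += "\n"
    if not snippet.endswith("\n"):
        snippet += "\n"
    return head + snippet + tail
```

Naively prepending the snippet to a file that starts with `from __future__ import annotations` produces a SyntaxError at compile time; the version above keeps such files valid.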

Wrap executables.

Some of the problems with wrappers can be worked around, so I wouldn’t toss them out of the kit over it. We can do things like move variables for “our” needs out into a NIX_-prefixed env and then patch the interpreter/executable to merge our env with the canonical one and perhaps exclude them when spawning subprocesses if helpful.
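
A sketch of the application side of that idea: the program merges a private search path into its own lookups but strips the private variables before spawning anything. The `NIX_WRAPPER_` prefix is hypothetical, and is deliberately narrower than plain `NIX_` so that real variables such as NIX_PATH are left alone.

```python
import os
import shutil
from typing import Dict, Mapping, Optional

PRIVATE_PREFIX = "NIX_WRAPPER_"  # hypothetical prefix for wrapper-only variables


def find_tool(name: str, env: Mapping[str, str]) -> Optional[str]:
    """Resolve name using the private search path first, then the user's PATH."""
    private = env.get(PRIVATE_PREFIX + "PATH", "")
    public = env.get("PATH", "")
    merged = os.pathsep.join(part for part in (private, public) if part)
    return shutil.which(name, path=merged)


def child_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Environment to hand to subprocesses: everything except our private variables."""
    return {key: value for key, value in env.items() if not key.startswith(PRIVATE_PREFIX)}
```

With this split, the user’s PATH is never modified, so the shell started by the terminal sees exactly what the session provided.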

I was just trying to distinguish between something a user who isn’t a serious contributor could plausibly take advantage of (by, say, just adding some hook) and something we can theoretically achieve but which will require significant novel work that encodes a lot of domain knowledge.

I suspect the specific case of guake would entail ~fixing these problems for Python. I’m not sure if he’ll notice, but maybe @fridh has thoughts on whether parts of this are tractable and just in need of work, or if there are good reasons we can’t.

I’m not sure about the full-throated version of the problem in a Python context, but I can at least reflect on the narrower problem of running executables from PATH in Python code based on my experience with resholve:

It should be at least somewhat feasible to patch interpreted source to replace these with absolute store paths, but this almost certainly can’t handle 100% of cases.
Simple/static cases like directly passing the string "grep" to a well-known exec API are probably very tractable.
Beyond simple cases will be a long tail of harder problems:
- code that aliases/renames exec APIs (e.g. to dynamically choose among them)
- code that uses these APIs indirectly through other modules/libraries
- code that invokes variables/strings created dynamically at runtime, whether through iteration/conditionals, through format strings, from user input, etc. (and rewriting these strings to be absolute paths might also affect every other statement that uses them)
- code that evals
- code that dynamically generates other scripts (Python, Shell, or something else), which it then tries to run
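
To give a feel for the “simple/static” end of that spectrum, here is a sketch that walks a Python AST and reports bare command names passed as literals to `subprocess` functions. It only recognises direct `subprocess.<func>(...)` calls; every case in the list above (aliases, indirection, dynamic strings, eval, generated scripts) slips straight past it. The function name and scope are my own choices for illustration, not an existing tool.

```python
import ast
from typing import List, Tuple

EXEC_FUNCS = {"run", "call", "check_call", "check_output", "Popen"}


def find_literal_commands(source: str) -> List[Tuple[int, str]]:
    """Return (line, command) for literal bare commands in subprocess.<func>(...) calls."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call) or not node.args:
            continue
        func = node.func
        if not (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "subprocess"
            and func.attr in EXEC_FUNCS
        ):
            continue
        first = node.args[0]
        if isinstance(first, (ast.List, ast.Tuple)) and first.elts:
            first = first.elts[0]  # the program is the first element of the argv list
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            command = first.value
            # Skip absolute/relative paths and whole shell command lines.
            if command and "/" not in command and not any(c.isspace() for c in command):
                found.append((node.lineno, command))
    return sorted(found)
```

Each reported location is a candidate for rewriting to an absolute store path, or for human triage if the rewrite is not obviously safe.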

Whether the work to make tooling like this is justified by the leverage it creates probably depends on the size of the ecosystem you’re making it for (and how well-represented it is in the Nix community). I imagine Python’s big enough to justify it?

I wouldn’t say all is lost below that floor. Something else we can do here is scan source (and even binaries) for signatures of behavior like this but then use human triage to investigate. I won’t elaborate for now since I’m pretty sure this is outside the scope of what you’re looking for, but I do something like this for resholve if anyone’s curious.

abathur · October 1, 2023, 5:08am

I won’t attest to any specific thing they’ll communicate, but if you’re sufficiently curious you may find discussion in these issues/PRs interesting:

General problems with environment variable wrappers · Issue #60260 · NixOS/nixpkgs · GitHub
Python: add sitecustomize.py, listen to NIX_PYTHONPATH by FRidh · Pull Request #64634 · NixOS/nixpkgs · GitHub
Python: add sitecustomize.py, listen to NIX_PYTHONPATH and NIX_PYTHON_SCRIPT_NAME by FRidh · Pull Request #25985 · NixOS/nixpkgs · GitHub
wrapPythonProgram: use site.addsitedir instead of PYTHONPATH by abbradar · Pull Request #17749 · NixOS/nixpkgs · GitHub
[RFC] Python library wrappers by timokau · Pull Request #53816 · NixOS/nixpkgs · GitHub
setting LD_LIBRARY_PATH breaks meld, ansible, possibly any Python app in Nixpkgs that use lib-dynload · Issue #9186 · NixOS/nixpkgs · GitHub

Apologies to any serial participants that just got 2-6 pings…