Help understanding the libGL ABI problem and possible solutions

The purpose of this post is to understand the libGL ABI problem and why it exists, as well as possible solutions to it. If you’re familiar with the issue, feel free to skip the “Background” section. My hope is that people will correct any mistakes I make in the “The Problem” section and also comment on the solutions in the “Solutions” section, as well as add any others I should have included.

Background

I frequently encounter issues such as /nix/store/.../lib/libc.so.6: version 'GLIBC_2.34' not found (required by ...), since I frequently run programs from versions of nixpkgs different from the nixpkgs version used for the current running system. I use numerous development shells for different projects, and these often use different nixpkgs versions than the system nixpkgs version. The frequently mentioned “solution” to this problem of updating the nixpkgs version from either the system or development shell (usually the development shell) to match the other is not satisfactory to me, as I often have a good reason for using a specific nixpkgs version in a development shell. Moreover, the benefit of Nix development shells to me is that I can come back to a project at any time in the future, irrespective of what I’ve changed on my system, and get the same development environment. The libGL ABI problem prevents this from being a reality.

The Problem

As I currently understand it, the problem is as follows. libGL is loaded impurely in Nix. That is, NixOS sets LD_LIBRARY_PATH=/run/opengl-driver/lib:/run/opengl-driver-32/lib. Then, applications that depend on the libGL shared library use the library under one of these paths.
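To make the search-path idea concrete, here is a minimal Python sketch of how a colon-separated path like the one above can be walked to find a library. It is a simplification: the real dynamic loader also consults RPATH/RUNPATH entries, the ld.so cache, and default directories. The function name resolve_library is made up for illustration.

```python
import os


def resolve_library(soname, ld_library_path):
    """Return the first existing file named `soname` along a colon-separated path.

    Simplified model of an LD_LIBRARY_PATH lookup; returns None if nothing matches.
    """
    for directory in ld_library_path.split(":"):
        if not directory:
            # The real loader treats an empty entry as the current directory;
            # this sketch simply ignores it.
            continue
        candidate = os.path.join(directory, soname)
        if os.path.exists(candidate):
            return candidate
    return None
```

For example, resolve_library("libGL.so.1", "/run/opengl-driver/lib:/run/opengl-driver-32/lib") returns the first matching file, or None if neither directory contains one. The point is that the result depends on what is on the running system, not on what the application was built against.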

The libGL library is itself dynamically linked against other libraries. An application can depend on libGL and also on other shared libraries that libGL depends on. Let’s call such a library, which the application links against both directly and indirectly through libGL, library X. When the application and libGL require the same version of X, there is no issue. However, if the application requires a different version of X than libGL does, a conflict results. A very common example of library X is glibc.

This issue commonly occurs on a NixOS system when an application is from a different version of nixpkgs than the nixpkgs used to build the NixOS system. In this case, the application may require a different version of library X than that required by libGL, which links against the current NixOS system version.
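Taking glibc as library X, the mismatch can be sketched in a few lines of Python. This is a simplified model, not how the loader is implemented: it assumes a glibc of version V provides every GLIBC_* symbol-version tag up to and including V, and it ignores non-numeric tags. The function names and the example versions are made up for illustration.

```python
import re

_TAG = re.compile(r"GLIBC_(\d+(?:\.\d+)*)")


def parse_glibc_tag(tag):
    """Turn 'GLIBC_2.34' into (2, 34); return None for non-numeric tags."""
    match = _TAG.fullmatch(tag)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def missing_versions(required_tags, provided_version):
    """List the required tags that a glibc of `provided_version` cannot satisfy."""
    provided = tuple(int(part) for part in provided_version.split("."))
    missing = []
    for tag in required_tags:
        needed = parse_glibc_tag(tag)
        # Compare as integer tuples so that 2.9 sorts before 2.34.
        if needed is not None and needed > provided:
            missing.append(tag)
    return sorted(missing, key=parse_glibc_tag)
```

If the system libGL needs {"GLIBC_2.2.5", "GLIBC_2.34"} but the application has already loaded a glibc 2.33 from its own, older nixpkgs, missing_versions reports GLIBC_2.34, which matches the shape of the "version 'GLIBC_2.34' not found" error.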

Why does libGL need to be loaded impurely? Why can’t each application load its own version of libGL from the Nix store, just like other shared library dependencies in Nix? Is the problem that this would disrupt caching, since each package would need to be built for the specific libGL implementation (e.g. Mesa, Nvidia, etc.)? If so, is there a way to accept this tradeoff? I’d happily recompile every package on my computer and not benefit from caching if it meant avoiding this problem. Or would it be problematic if two different applications running on the same system linked against different versions of libGL? If so, can someone explain the issue?

Solutions

  1. Statically link libGL. I suppose this would only work when using mesa, or another open-source implementation of libGL. Would this resolve this issue?
  2. Use nixGL. I don’t really understand how this works/avoids the issue, or really how to use it in a development shell.
  3. People frequently mention using LD_PRELOAD, but this seems more like a temporary workaround. Also, I’ve tried using this in the past without success so I’m not totally sure I understand how to do it correctly. I’ve also seen LD_LIBRARY_PATH mentioned.

Relevant links

Yes, the main original reason is that Nvidia users need a different libGL than the others (and it’s even unfree).

As for glibc conflicts, I believe that building against “sufficiently old” glibc would solve things – while running against a new glibc. The thing is that glibc tries very hard to provide ABI compatibility, but it’s one-way compatibility (upgrading between build-time and run-time).

Thanks for the response. So is it possible to override this behavior and tell nixpkgs to link all packages against a specific libGL of my choosing (in my case this would be mesa, which I use for my AMD GPU)? As stated, I would happily rebuild the world in exchange for purity.

Such an option is not implemented currently.

Are you aware of any existing effort on this front? Would it be reasonably feasible? Would it make sense for me to open a feature request in nixpkgs to, at the very least, gauge interest and coordinate efforts?

Considering that people use Nix for hermetic builds and purity, I can’t imagine I’m the only person who would be happy to make this trade.

Some efforts have happened over the years, e.g.

but I can’t recall anything hopeful.

For the glibc conflict (and maybe others), wouldn’t it make sense to automatically wrap packages with code that picks the appropriate glibc? For example, if the glibc provided with the package is compatible with the system, we keep it; otherwise we force the use of the system glibc. We could even imagine environment variables defining the policy for this choice, for instance to force the use of the package’s glibc.
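A rough Python sketch of the selection logic such a wrapper might apply is below. Everything in it is hypothetical: the GLIBC_WRAPPER_POLICY variable, the function names, and the reading of “compatible” as “the package’s glibc is at least as new as the newest version the system’s driver libraries require” are assumptions made for illustration, and no such wrapper exists in nixpkgs as far as this thread establishes.

```python
import os


def version_tuple(version):
    """Turn '2.34' into (2, 34) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def choose_glibc(package_glibc, system_glibc, required, environ=os.environ):
    """Pick the glibc version a wrapper would hand to the program.

    package_glibc: version the package was built with
    system_glibc:  version of the running system
    required:      newest version needed by the impurely loaded driver libraries
    """
    # Hypothetical policy variable: 'auto' (default), 'package', or 'system'.
    policy = environ.get("GLIBC_WRAPPER_POLICY", "auto")
    if policy == "package":
        return package_glibc
    if policy == "system":
        return system_glibc
    # auto: keep the package's glibc when it is new enough, else fall back.
    if version_tuple(package_glibc) >= version_tuple(required):
        return package_glibc
    return system_glibc
```

With this sketch, a package built with glibc 2.33 on a system whose driver needs 2.34 would get the system glibc, while the same package on a system whose driver only needs 2.31 would keep its own.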

I remember there being issues about this before. Here’s one: Support libGL in non-NixOS Nixpkgs · Issue #62169 · NixOS/nixpkgs · GitHub

That’s effectively what libcapsule intends to do.

I’m personally not very hopeful about such an approach, as you’d be making the packages even less pure than they are now.

I think I must be missing something here (and my familiarity with the whole graphics ecosystem is poor, so that’s likely). If I am, please correct my misunderstanding(s).

But, why not just link applications against specific libGLs present under the nix store, just like every other library dependency in Nix? That way, application A could use libGL W (and both depend on glibc Y) and application B could use libGL X (and both depend on glibc Z). And applications A and B could coexist without conflict. In fact, this is one of the primary motivations and selling points of Nix and NixOS as I understand it.

Maybe this is not possible in non-NixOS cases, but those shouldn’t prevent it being done when it is possible.

Perhaps caching is less effective with this solution, but in my opinion, purity is of primary importance in Nix whereas caching is of secondary importance. We can choose just about any other distribution out there if we’re happy with impurity.

Assuming I understand the wrapping solution correctly, I’m not a fan of that either. It sounds like you’re just patching one impurity with another impurity.

The libGL library is essentially an interface to the hardware. At least on nvidia, the version of libGL is tied to the version of the nvidia driver kernel module, so application A and application B cannot use different versions.

If you linked the nvidia libGL to every binary directly, you would achieve purity, but you would need to rebuild the world on every nvidia driver update. Packages built this way wouldn’t work on any machine that isn’t using the exact version of the nvidia kernel module it was built with, so it wouldn’t really be portable even across nvidia machines, not to mention that it obviously wouldn’t work on any other brand of GPU.

To me, a very important benefit of Nix is that if a package works on one machine, it works on (basically) all machines. You would sacrifice this guarantee in practice by introducing a separate package for every possible GPU driver version for all graphical apps.

Thanks for that explanation. That clarifies things quite a bit and I agree that situation is not desirable.

So, to address the problem of incompatible dependencies (e.g. application A from an old nixpkgs depends on libGL and glibc vX, whereas the libGL on the current running system depends on glibc vY), what about my previous suggestion of statically linking libGL (as long as you’re using an implementation of OpenGL you can compile yourself)? Nix makes statically linking arbitrary packages straightforward, and I expect doing so for libGL is no exception, though I haven’t tried yet. But I don’t think this would be possible, at least not without some effort, for the Nvidia implementation, since you’d somehow need to replace the shared library dependencies with static dependencies. Unless I’m mistaken, that would resolve the incompatible dependency problem at the cost of additional memory footprint when using Mesa. One of the nice things about this solution (assuming it works) is that Nixpkgs doesn’t need to make this choice for users: users can choose to do it themselves if that tradeoff is desirable for them.

I suppose this still doesn’t guarantee that an application that works on some Linux/NixOS system would work on a later system, since the libGL API could change. But, it sounds like that API is very stable, so hopefully the likelihood of that is small.

I just tried static linking libGL (pkgs.pkgsStatic.mesa.drivers) and it doesn’t work. Apparently this uses libglvnd and that doesn’t support static linking. Anyone interested can read more about that here. In any event, this isn’t looking like the easy solution I was hoping it would be.

Since glibc’s compatibility is one-way, but a newer glibc remains compatible with programs built against older glibcs, wouldn’t it be possible to always choose the newest glibc in the namespace?

I.e. in the common case where the application is built against an older glibc while the one the driver links against is newer, you’d use the driver’s glibc for the driver as well as the app.
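The “pick the newest” rule can be sketched in Python as follows. It only models the selection step, not how a loader would actually be made to use the chosen glibc; the function names and the labels in the example are made up for illustration.

```python
def version_tuple(version):
    """Turn '2.34' into (2, 34) so that 2.9 sorts before 2.34."""
    return tuple(int(part) for part in version.split("."))


def newest_glibc(candidates):
    """Return the (label, version) pair with the highest glibc version.

    `candidates` maps a label such as 'app' or 'driver' to the glibc
    version that component was built against.
    """
    return max(candidates.items(), key=lambda item: version_tuple(item[1]))
```

For instance, newest_glibc({"app": "2.33", "driver": "2.35"}) selects the driver’s glibc, which under glibc’s one-way compatibility should also be able to run the app. The reverse choice would not work, since the driver may need symbols the older glibc lacks.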

Yes, possible. Theoretically at least.

Do you have any ideas on how one could implement something like that?

I’m not even convinced that it’s a good approach. It would make the impurity worse and it would only really work on glibc and maybe libstdc++ (and we already did have libGL failing because of other libraries).

I don’t really understand this argument. To me, purity is good for making sure that if program A runs on machine B, it also runs on machine C. But enforcing too much purity actually has the opposite effect here: being too pure prevents me from running package A on machine C even if it runs on machine B. What’s the point of being pure if I can’t run my program anyway?

The solution I propose still maintains purity if the system’s glibc is compatible with the binary’s glibc, so if it was working before, it still works after applying my solution. On the other hand, in cases where the program would crash, my proposed solution would make it work.

I think we should keep in mind that purity was created to improve reproducibility, and purity is not, in itself, an end goal. A reproducible, predictable crash is not valuable at all. And to reproduce a bug, we can still provide the version of the system glibc as well; I don’t see how it would be less reproducible.