Allowing Multiple Versions of Python Package in PYTHONPATH/nixpkgs

TL;DR: I wanted to get feedback on a potential feature that allows multiple versions of the same Python package to be installed in the same PYTHONPATH. This is a general approach that is not specific to nixpkgs and could be used by other package managers. The only Nix-specific part is the tooling that builds these specialized packages. All of the materials and the demo are in the GitHub repo costrouc/python-multiple-versions. (Also posted on discuss.python.org as "Allowing Multiple Versions of Same Python Package in PYTHONPATH".)

Demo of Multiple Versions of a Python Package

This is a self-contained demo of having multiple versions of a Python
package in the same PYTHONPATH. It requires
nix (sorry, Nix has no Windows support). The
idea is not Nix-specific, but it relies on package managers/build tooling to
support multiple versions.

$ nix-shell
...
[nix-shell:~/p/python-multiple-versions]$ python
Python 3.7.4 (default, Jul  8 2019, 18:31:06) 
[GCC 7.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import foobar; foobar.foobar()
I am using flask version 1.0.3
>>> import bizbaz; bizbaz.bizbaz()
I am using flask version 0.12.4
>>> quit()
$ echo $PYTHONPATH
...:/nix/store/f3j11lk2m8ddw2j2axvcdfc2al2bk98c-flask-0.12.4/lib/python3.7/site-packages:.../nix/store/wv42si07c8wd64ravd4va4kh4j7prwlk-python3.7-Flask-1.0.3/lib/python3.7/site-packages:...

Motivation

In nixpkgs we like to have a
single version of each package (preferably the latest), with all packages
compatible with one another. Two packages are often incompatible because
they need different versions of a shared dependency. If that dependency is
a compiled library/binary, we have the luxury of rewriting the shared
library path, which lets two packages that use different versions of it
coexist. In Python this philosophy breaks down because all packages
are resolved through the global PYTHONPATH. When a package
runs import flask, Python searches the path for flask and uses the
first one that it finds.
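A minimal sketch of that lookup behaviour, assuming nothing beyond the standard library. The package name fakeflask and the two throwaway directories are hypothetical stand-ins for the two flask store paths in the demo; the point is only that whichever directory comes first on the path decides which version is imported.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_fake_package(root: Path, name: str, version: str) -> Path:
    """Create a throwaway site-packages-like directory holding one tiny package."""
    site = root / f"{name}-{version}"
    pkg = site / name
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text(f"__version__ = {version!r}\n")
    return site


def version_seen_by_import(name: str, search_dirs: list) -> str:
    """Import `name` with `search_dirs` prepended to sys.path and return its version."""
    saved_path = list(sys.path)
    sys.modules.pop(name, None)  # forget any previously imported copy
    sys.path[:0] = [str(d) for d in search_dirs]
    importlib.invalidate_caches()
    try:
        return importlib.import_module(name).__version__
    finally:
        # Restore the interpreter state so the demo has no side effects.
        sys.path[:] = saved_path
        sys.modules.pop(name, None)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    old = make_fake_package(root, "fakeflask", "0.12.4")  # hypothetical package
    new = make_fake_package(root, "fakeflask", "1.0.3")
    first = version_seen_by_import("fakeflask", [old, new])
    second = version_seen_by_import("fakeflask", [new, old])
    print(first, second)
```

Both directories are on the path in both runs; only their order changes, and with it the version every importer in the process would get.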

For nixpkgs this is troublesome because it makes it impossible to keep
all packages compatible with one another.

Examples of Issue

  1. jsonschema. jupyterlab_server
    requires jsonschema >= 3.0.1 and cfn-python-lint did not
    support jsonschema 3 until about a month ago. 3.0 was released in
    February!

  2. Some packages pin a dependency to an exact version, so that other
    packages in the same PYTHONPATH cannot depend on the latest version.
    For example, apache-airflow pins pendulum == 1.4.4. That
    pendulum release is over 1.5 years old, and libraries.io reports that
    400+ packages depend on pendulum. We cannot let a single package
    restrict the version available to all other packages.

How does this work?

I wrote a tool, python-rewrite-imports, that helps make multiple versions possible. Let's say that package bizbaz wants an old version, flask==0.12.4, while another package, foobar, requires the latest version, flask>=1.0. Normally these two packages would be incompatible. To let them coexist, we:

  1. Create a build of flask 0.12.4 and install it
  2. Use Rope to rewrite all of flask's imports of itself to flask_0_12_4_1pamldmw2y7g and rename the package to flask_0_12_4_1pamldmw2y7g
  3. Rename the dist in site-packages and move the package to flask_0_12_4_1pamldmw2y7g
  4. Rewrite all imports of flask in bizbaz to flask_0_12_4_1pamldmw2y7g

Rewriting all imports is done with
Rope, a robust Python
refactoring tool.
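To give a feel for what the rewrite in steps 2 and 4 does to source code, here is a rough sketch that uses the standard-library ast module as a stand-in for Rope. It is not how python-rewrite-imports or Rope work internally: the class RenameTopLevelPackage and the function rewrite_imports are made up for this illustration, the renaming is naive (it renames every bare name equal to the old package name, with no scope analysis), and ast.unparse needs Python 3.9 or newer and drops comments and formatting.

```python
import ast


class RenameTopLevelPackage(ast.NodeTransformer):
    """Naively rename a top-level package in imports and bare name references."""

    def __init__(self, old: str, new: str):
        self.old, self.new = old, new

    def _rename(self, dotted: str) -> str:
        # Only the first component of a dotted module path is replaced.
        head, sep, rest = dotted.partition(".")
        return self.new + sep + rest if head == self.old else dotted

    def visit_Import(self, node: ast.Import) -> ast.Import:
        for alias in node.names:
            alias.name = self._rename(alias.name)
        return node

    def visit_ImportFrom(self, node: ast.ImportFrom) -> ast.ImportFrom:
        # Relative imports (level > 0) never mention the package name.
        if node.module and node.level == 0:
            node.module = self._rename(node.module)
        return node

    def visit_Name(self, node: ast.Name) -> ast.Name:
        # Assumption: no local variable shadows the package name.
        if node.id == self.old:
            node.id = self.new
        return node


def rewrite_imports(source: str, old: str, new: str) -> str:
    tree = RenameTopLevelPackage(old, new).visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


source = (
    "import flask\n"
    "from flask.views import View\n"
    "app = flask.Flask(__name__)\n"
)
rewritten = rewrite_imports(source, "flask", "flask_0_12_4_1pamldmw2y7g")
print(rewritten)
```

A scope-aware refactoring tool such as Rope avoids the shadowing problem this sketch ignores, which is a large part of why it was chosen over a hand-rolled transformer.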

Current Limitations

  • Supporting several versions of a package that builds C extensions
    looks harder than just rewriting the imports.
  • Suppose package A requires C==1.0.0 and B requires
    C>=1.1. Let’s say that package B calls a method in A with a
    structure built from C>=1.1, and A then passes that data to its own
    copy of C (1.0.0), which may not understand it. This will probably
    not happen often.
  • Rope does not currently handle all rewrites in
    Python 3. Expressions within f-strings are the only example that I
    know of.
  • It is impossible for Rope to handle all import rewrites, for
    example: import flask; globals()[chr(102) + 'lask'].__version__

I believe for the vast majority of packages that require multiple
versions these issues will be rare.
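The second limitation (data crossing between two copies of C) can be made concrete with a small sketch. Everything here is hypothetical: the module names c_1_0_0_hypothetical and c_1_1_0_hypothetical, the Payload class and the describe function are invented for illustration, and for brevity both "versions" are loaded from the same source, whereas real versions would also differ in code.

```python
import types

# Source of the imaginary package C; loaded twice under two renamed identities.
C_SOURCE = '''
class Payload:
    def __init__(self, value):
        self.value = value

def describe(obj):
    if not isinstance(obj, Payload):
        raise TypeError(
            f"expected {Payload.__module__}.Payload, "
            f"got {type(obj).__module__}.{type(obj).__name__}"
        )
    return f"payload({obj.value})"
'''


def load_copy(module_name: str) -> types.ModuleType:
    """Execute C_SOURCE in a fresh module object, as a renamed copy would be."""
    module = types.ModuleType(module_name)
    exec(C_SOURCE, module.__dict__)
    return module


c_old = load_copy("c_1_0_0_hypothetical")  # the copy package A was rewritten to use
c_new = load_copy("c_1_1_0_hypothetical")  # the copy package B was rewritten to use

payload_from_b = c_new.Payload(42)
same_version_result = c_new.describe(payload_from_b)

try:
    # Package A hands B's object to its own copy of C.
    c_old.describe(payload_from_b)
    crossed_versions_ok = True
except TypeError as exc:
    crossed_versions_ok = False
    error_message = str(exc)

print(same_version_result, crossed_versions_ok)
```

The two Payload classes are distinct objects even though they share a name, so isinstance checks, and anything else keyed on class identity, stop working once objects cross between the copies.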


Many thanks for this! Another possible use is name clashes. For instance python-jsonrpc-server (available in nixpkgs) clashes with json-rpc (not currently packaged), as both want to use the jsonrpc module name. (EDIT: Previously I said json not jsonrpc)


What a great point. Didn’t think of that thanks!

Update: I’ve been tearing my hair out with that one and I think python-jsonrpc-server have actually done the rename (great) but appear to have released the fix with the same version number! Changing the checksum solves my problem :crazy_face:

Great work. Rewriting the AST is a great way to fix design decisions at odds with Nix, and I would like to see it more often in Nixpkgs.

Wouldn't renaming it at the point of import fix that?

import flask_0_12_4_1pamldmw2y7g as flask

Yeah, actually this might be an even better approach. It doesn’t require rewriting the AST as much, and it is guaranteed not to collide with the original. Hmm, I’ll look into this, since it would require a less complex AST tool and would feel less destructive.

This approach will still provide a unique entry in sys.modules while providing a functioning variable name in the module doing the import. There is still the case of importlib, which this cannot cover and for which something like shims is needed. Other than that, this could remove the need for wrappers or injecting code in entry points.
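A small sketch of the aliasing idea, assuming a renamed package is already on the path. The names fakeflask_0_12_4_hypothetical and bizbaz_demo are invented for the demo and are written to a temporary directory; nothing here comes from the actual tooling.

```python
import importlib
import sys
import tempfile
from pathlib import Path

RENAMED = "fakeflask_0_12_4_hypothetical"  # hypothetical renamed package

with tempfile.TemporaryDirectory() as tmp:
    # A renamed copy of the dependency, as the build tooling would install it.
    pkg = Path(tmp) / RENAMED
    pkg.mkdir()
    (pkg / "__init__.py").write_text("__version__ = '0.12.4'\n")

    # The consumer only has its import line changed; the body still says `fakeflask`.
    (Path(tmp) / "bizbaz_demo.py").write_text(
        f"import {RENAMED} as fakeflask\n"
        "\n"
        "def bizbaz():\n"
        "    return fakeflask.__version__\n"
    )

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        bizbaz_demo = importlib.import_module("bizbaz_demo")
        reported = bizbaz_demo.bizbaz()
        registered_under_new_name = RENAMED in sys.modules
        registered_under_old_name = "fakeflask" in sys.modules
        # A dynamic import by the original name is the case aliasing cannot cover.
        try:
            importlib.import_module("fakeflask")
            dynamic_import_works = True
        except ModuleNotFoundError:
            dynamic_import_works = False
    finally:
        sys.path.remove(tmp)

print(reported, registered_under_new_name, registered_under_old_name, dynamic_import_works)
```

The module is registered in sys.modules only under its renamed identity, so another version can sit beside it, while code that builds the module name at runtime still needs a shim.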

C extensions will be harder. When you have two versions of the same or a similar library, their symbols would clash when loaded by the dynamic linker. It might be possible to hide symbols and use dlopen/dlsym to look up only the Python library entry points directly, but you will see problems again once those libraries have incompatible dependencies.


@Mic92 Might be possible to use objcopy to rename symbols in order to prevent symbol clashes.

The .so-file that you import into Python must be renamed to match the new name. Also, you need to change the symbol “PyInit_modulename” to contain the new module name. Not sure if there are any other things.

Not sure if objcopy works on macOS though.

And if some Python code in the library or outside it tries to reference symbols in the binary directly, then that Python code would also need to be patched.

objcopy is not able to rename symbols in the .dynsym section of shared libraries. The renaming has to take place in the object files before they are linked together.

An attempt to implement symbol-renaming in shared libraries in objcopy was made 15 years ago - but the author eventually gave up, as it turned out to be harder than anticipated.