Properly manage and lock the versions of nixpkgs and mach-nix for environments created via the mach-nix env command.
Add an example of how to use mach-nix with jupyterWith.
Improvements
Improve portability of mach-nix env generated environments. Replace the platform-specific compiled nix expression with a call to mach-nix itself, which is platform-agnostic.
Mach-nix now produces the same result regardless of whether it is used through the flakes or the legacy interface. The legacy interface now loads its dependencies via flake.lock.
Fixes
mkDockerImage produced corrupt images.
non-python packages passed via packagesExtra were not available at runtime. Now they are added to the PATH.
remove <nixpkgs> impurity in the dependency extractor used in buildPythonPackage.
add argument ignoreCollisions to all mk* functions
add passthru attribute expr to the result of mkPython, which is a string containing the internally generated nix expression (see the sketch after this list)
add flake output sdist, to build a pip-compatible sdist distribution of mach-nix
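A minimal sketch of how the first two additions can be combined; the requirements string is just an illustration:

```nix
let
  mach-nix = import (builtins.fetchGit {
    url = "https://github.com/DavHau/mach-nix";
  }) {};

  env = mach-nix.mkPython {
    requirements = "requests";
    # skip the collision check on the resulting environment
    ignoreCollisions = true;
  };
in
# `expr` exposes the internally generated nix expression as a string,
# useful for inspecting what mach-nix actually builds
builtins.trace env.expr env
```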
Fixes
Sometimes wrong package versions were inherited when using the nixpkgs provider, leading to collision errors or unexpected package versions. Now, python dependencies of nixpkgs candidates are automatically replaced recursively.
When cross building, mach-nix attempted to generate the nix expression using the target platform's python interpreter, resulting in failure.
Package Fixes
cartopy: add missing build inputs (geos)
google-auth: add missing dependency six when provider is nixpkgs
I implemented an alternative way of fetching python packages that doesn't require an index or a dependency database. It is a fixed-output derivation which just uses pip. Reproducibility is ensured by a local proxy that filters pypi.org responses by date, to provide a snapshot-like view on pypi.
The Disadvantages I see:
all packages have to be re-downloaded after each change in requirements
outputHash needs to be updated after each change in requirements
The Benefits I see:
doesn't require storing big index files (they can be annoying as flake inputs, etc.)
Maybe a tool could be built around this which offers similar comfort to mach-nix, but without a lot of its complexity (packages could somehow be cached locally to solve the re-downloading issue, etc.).
Not having to maintain the resolver and pypi crawlers would be a big game changer I think.
Just using pip directly is a lot easier than trying to imitate its behavior.
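To make the idea concrete, here is a rough sketch of such a fixed-output derivation (the proxy address, the requirements file, and the fake hash are placeholders, not the actual prototype):

```nix
{ pkgs ? import <nixpkgs> {} }:

# Fixed-output derivations may access the network; the outputHash
# pins the downloaded result and keeps the build reproducible.
pkgs.stdenv.mkDerivation {
  name = "pip-download-snapshot";
  outputHashAlgo = "sha256";
  outputHashMode = "recursive";
  # placeholder hash: build once, then copy in the real hash nix reports;
  # it must be updated after every change in requirements
  outputHash = pkgs.lib.fakeSha256;

  nativeBuildInputs = [ pkgs.python3Packages.pip ];

  buildCommand = ''
    mkdir -p $out
    # hypothetical local proxy that filters pypi.org responses by date,
    # providing the snapshot-like view on pypi
    export https_proxy=http://127.0.0.1:8080
    pip download --dest $out --requirement ${./requirements.txt}
  '';
}
```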
Recently I have undertaken some major refactoring of the crawler architecture which is about to be finished.
This happened outside the mach-nix repo, namely in pypi-deps-db and nix-pypi-fetcher.
As you may know, mach-nix depends on both these projects being updated regularly to be able to compute dependency graphs and fetch packages reproducibly.
The motivations behind the changes were:
improve maintainability
add data for python 3.9 and 3.10
simplify the process of introducing new python versions
remove any non public infrastructure parts
remove the requirement of trusting in me hosting the crawlers
make it easy for people to fork and maintain their own data
The following changes have been made:
remove the requirement of an SQL database. All update cycles now operate directly on the json files contained in the repo.
both projects contain a flake app that updates the data on a local checkout.
both projects contain a GitHub Actions cron job that updates the data regularly
python versions can be added / removed by slightly modifying the flake.nix
a new directory ./sdist-errors is added to pypi-deps-db, containing information about why extracting requirements of a specific sdist package failed.
If the projects are forked on GitHub, the data should continue to update itself without further interaction, as the workflow file will be forked with the project.
On any non-GitHub CI system it should be as simple as installing nix with flakes and then executing the included flake app regularly to keep the data updated.
The newest version of pypi-deps-db now supports python 3.9 and 3.10 while 3.5 was removed.
I still kept python 2.7 despite it being EOL. My gut tells me there is still too much software around depending on it. Does anybody still need 2.7?
(Despite this change being backward incompatible, I did not bump the major version, since everything flakes-related should be considered experimental anyway.)
Improvements
Mach-nix (used via flakes) will now throw an error if the selected nixpkgs version is newer than the dependency DB since this can cause conflicts in the resulting environment.
When used via flakes, it was impossible to select the python version because the import function is no longer used. Now a python interpreter can alternatively be passed to mkPython (see the sketch after this list).
For the flakes command-line API, collisions are now ignored by default.
The simplified override interface did not deal well with non-existent values.
Now the .add directive automatically assumes an empty list/set/string when the attribute to be extended doesn't exist.
Now the .mod directive passes null to the given function if the attribute to modify doesn't exist.
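A sketch combining the python selection with the two directives; the package and inputs are illustrative, and I'm assuming the _. attribute of the simplified override interface:

```nix
{ mach-nix, pkgs }:  # mach-nix = the flake's lib for your system; pkgs = an imported nixpkgs

mach-nix.mkPython {
  requirements = "pillow";

  # the python interpreter can now be passed directly when using flakes
  python = "python39";

  # .add appends; an empty list is assumed if buildInputs doesn't exist yet
  _.pillow.buildInputs.add = [ pkgs.libjpeg ];

  # .mod receives the old value, or null if the attribute doesn't exist
  _.pillow.patches.mod = oldPatches:
    if oldPatches == null then [ ] else oldPatches;
}
```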
Fixes
Generating an environment with a package named "overrides" failed due to a variable name collision in the resulting nix expression.
When used via flakes, the pypiData was downloaded twice, because the legacy code path for fetching was still used instead of the flakes input.
nix flake show mach-nix failed because it required IFD for foreign platforms.
For environments generated via mach-nix env ... the python command referred to the wrong interpreter.
When checking wheels for compatibility, the minor version for python was not respected which could lead to invalid environments.
Some python modules in nixpkgs propagate unnecessary dependencies which could lead to collisions in the final environment. Now mach-nix recursively removes all python dependencies which are not strictly required.
Package Fixes
cryptography: remove rust-related hook when version < 3.4
I’m currently experimenting with python’s import system.
The goal is to allow packages to have private dependencies which are not propagated into the global module scope. If this works, we could build python environments containing more than one version of the same library. This in turn would make dependency resolution trivial/unnecessary and could solve the patching madness in nixpkgs.
I somehow cannot believe that nobody ever tried this, but I could not find such attempts online. In case anybody knows about such attempts or has any input regarding this, I’d appreciate it.
Thanks for that. I actually took a look at your approach a while ago, @costrouc, and it was definitely inspiring. I forgot to mention that earlier. Now I am planning to implement something that doesn't require any modification of library code.
My current idea is to replace builtins.__import__, which is called on every import. This new import function would then inspect the caller's location. Depending on the location it is called from, it would choose from a different set of dependencies (every package would bring its own site-packages). Each imported module would get a new unique name in sys.modules to prevent clashes.
My goal is to get rid of sys.path/PYTHONPATH completely and only use the new style of packaging/importing.
The system would be smart enough to detect if two modules depend on the same version and only instantiate that module version once.
In the last few days, I have already implemented something similar via importlib’s PathFinder and FileFinder. But that was too hacky and fragile. Later I found out it is possible to just override builtins.__import__. This should make it easier.
So now I'm starting over with a blank page and thought I'd reach out to you guys first, to prevent ending up in another If-I-had-only-known situation.
I haven’t seen the discussion on the python forum so far. That is definitely interesting.
Conda support is now merged into master. By default it is disabled, but it can be enabled by adding conda to the providers, for example like this:
providers._default = "conda,wheel,sdist,nixpkgs"
or by passing requirements.yml content to requirements, which will enable the provider automatically.
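Put together as a sketch (reading the conda-style requirements from a local file is just one way to pass them):

```nix
mach-nix.mkPython {
  # requirements.yml content may mix pip and conda syntax;
  # passing it here enables the conda provider automatically
  requirements = builtins.readFile ./requirements.yml;

  # alternatively, enable conda explicitly via the provider list
  providers._default = "conda,wheel,sdist,nixpkgs";
}
```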
Supporting conda required me to implement a custom requirements parser, as the new format allows both pip and conda formats and even mixes of these.
Therefore, even if you don’t use conda, there is a chance that you might discover a bug. There is extensive unit testing on these changes, but edge cases could still come up. I might leave this on master for a bit longer and see if anything gets reported by you guys.
conda-forge is included by default. Just add it to the providers (like "conda-forge,conda,nixpkgs"). But good point, docs are missing for some of the new stuff ;). In the meantime, scroll up this topic to when the conda beta was released. There are some examples included.
Is there any way to use mach-nix and have the resulting env use a modified or custom python? Something like python = (enableDebugging pkgs.python39); in the call to mkPythonShell? I want to set up an environment where I can use cygdb in cython, and that requires a debugging python.
You can make an overlay for nixpkgs which replaces python39 with your debugging python. Then import nixpkgs with that overlay and pass it to mach-nix during import via the pkgs argument. If you need further assistance, feel free to open an issue on GitHub.
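A rough sketch of what that could look like (untested; pin a mach-nix ref/version as needed, and the requirements are just an example):

```nix
let
  # overlay that swaps python39 for a debugging-enabled build
  debugPythonOverlay = self: super: {
    python39 = super.enableDebugging super.python39;
  };

  pkgs = import <nixpkgs> { overlays = [ debugPythonOverlay ]; };

  mach-nix = import (builtins.fetchGit {
    url = "https://github.com/DavHau/mach-nix";
  }) {
    inherit pkgs;  # hand the overlaid nixpkgs to mach-nix
  };
in
mach-nix.mkPythonShell {
  requirements = "cython";
}
```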