Hi guys, I’ve been working on this for a while now and would love to hear your thoughts.
Some of us know that Python packaging is hard.
For packaging Python we currently have some options:
There is a third: it helps you create a Python environment with dependencies from https://pypi.org/
It works like this:
You start with a list of requirements you want to install:
# /path/to/requirements.yaml
aioextensions: "*"
Django: ">3.2"
You use Makes to generate all the information required to package the dependencies with Nix.
The process is automatic, just execute:
$ m github:fluidattacks/makes@21.09 /utils/makePythonPypiEnvironmentSources \
    "${python_version}" \
    /path/to/requirements.yaml \
    /path/to/sources.yaml # This file will be generated
The generated file is a bunch of links and hashes:
$ cat /path/to/sources.yaml
links:
  - name: Django-3.2.6-py3-none-any.whl
    sha256: 04qzllkmyl0g2fgdab55r7hv3vqswfdv32p77cgjj3ma54sl34kz
    url: https://pypi.org/packages/py3/D/Django/Django-3.2.6-py3-none-any.whl
  ...
Then use it in your project:
# /path/to/default.nix
let
  makes = import "${builtins.fetchGit {
    url = "https://github.com/fluidattacks/makes";
    rev = "7a2b256168a7a5b58cf79d9383e5b463dae9c5e5";
  }}/src/args/agnostic.nix" { }; # agnostic.nix is a function, so call it with an (empty) argument set
in
makes.makePythonPypiEnvironment {
  name = "example";
  sourcesYaml = /path/to/sources.yaml;
}
Run nix-build, source the output, and your shell now has the specified packages available!
Good things:
works on Linux and macOS
every dependency installer is fetched and cached separately (fixed-output derivations)
it’s secure against supply-chain attacks (hashes everywhere), but you never compute hashes manually
works with --option sandbox true
you can specify any version of the packages you like
you can use packages that are not yet in nixpkgs
the generator script checks for dependency conflicts
the environment is fully pinned and stable, even if you start with lax selectors like pkg==* or pkg>=1.0
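To make the "lax selector in, exact pin out" idea concrete, here is a minimal Python sketch. It is not how the Makes generator resolves versions (the thread does not show that); the names pin and parse_version are made up for illustration, and versions are treated as plain dotted integers rather than full PEP 440 versions.

```python
import re


def parse_version(text):
    """Turn '3.2.6' into (3, 2, 6). Only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


# Comparison operators supported by this simplified sketch.
OPERATORS = {
    "==": lambda candidate, wanted: candidate == wanted,
    ">=": lambda candidate, wanted: candidate >= wanted,
    "<=": lambda candidate, wanted: candidate <= wanted,
    ">": lambda candidate, wanted: candidate > wanted,
    "<": lambda candidate, wanted: candidate < wanted,
}


def pin(selector, available):
    """Return the highest version in `available` that satisfies `selector`, or None."""
    candidates = sorted(available, key=parse_version, reverse=True)
    if selector.strip() == "*":
        return candidates[0] if candidates else None
    match = re.fullmatch(r"(==|>=|<=|>|<)\s*([0-9.]+)", selector.strip())
    if match is None:
        raise ValueError(f"unsupported selector: {selector!r}")
    operator, wanted = match.group(1), parse_version(match.group(2))
    for version in candidates:
        if OPERATORS[operator](parse_version(version), wanted):
            return version
    return None


# Hypothetical list of published versions, for illustration only.
available = ["3.1.14", "3.2", "3.2.5", "3.2.6"]
print(pin(">3.2", available))
```

Once the selector has been resolved against a fixed candidate list, only the exact result needs to be recorded, which is why the generated file can stay stable even though the input selector was loose.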
So that’s it. We are using this in production at app.fluidattacks.com (see an example here).
It would be nice to hear your thoughts! Bye
There is another option: mach-nix (GitHub: DavHau/mach-nix, "Create highly reproducible python environments"), which seems to be pretty good (I haven’t had to use it personally; I left a Python job before I was aware of it).
Also, you can use nix-template to ease some of the one-off packages:
$ nix-template python -u https://pypi.org/project/libagent/ --stdout
Determining latest release for libagent
{ lib, buildPythonPackage, fetchPypi }:
buildPythonPackage rec {
  pname = "libagent";
  version = "0.14.2";
  src = fetchPypi {
    inherit pname version;
    sha256 = "62aae671df342923475323cf0677bfcef796cc48e6989039a20f29c8e4a9e5b6";
  };
  propagatedBuildInputs = [ ];
  pythonImportsCheck = [ "libagent" ];
  meta = with lib; {
    description = "Using hardware wallets as SSH/GPG agent";
    homepage = "http://github.com/romanz/trezor-agent";
    license = licenses.CHANGE;
    maintainers = with maintainers; [ jonringer ];
  };
}
brogos
August 22, 2021, 8:35pm
3
mach-nix is working very well for me. I tried poetry2nix too and it worked for me but, unfortunately, it’s not using wheels anymore, even with preferWheels = true.
brogos
August 23, 2021, 12:26am
4
@kamadorueda I needed to make a modification to get nix-build to work:
let
  makes = import "${builtins.fetchGit {
    url = "https://github.com/fluidattacks/makes";
    ref = "refs/tags/21.09";
  }}/src/args/agnostic.nix" {};
in
makes.makePythonPypiEnvironment {
  name = "example";
  sourcesYaml = ./sources.yaml;
}
But with this requirements.yaml:
Cython: "*"
matplotlib: ">=3.2.2"
numpy: ">=1.18.5"
opencv-python: ">=4.1.2"
Pillow: "*"
PyYAML: ">=5.3.1"
scipy: ">=1.4.1"
tensorboard: ">=1.5"
torch: "==1.7.0"
torchvision: "==0.8.1"
tqdm: ">=4.41.0"
seaborn: ">=0.11.0"
pandas: "*"
thop: "*"
pycocotools: "==2.0"
And creating sources.yaml this way:
m github:fluidattacks/makes@21.09 /utils/makePythonPypiEnvironmentSources "3.8" $PWD/requirements.yaml $PWD/sources.yaml
I’m having this problem:
❯ nix-build
error: hash mismatch in file downloaded from 'https://files.pythonhosted.org/packages/1f/bb/5d3246097ab77fa083a61bd8d3d527b7ae063c7d8e8671b1cf8c4ec10cbe/colorama-0.4.4.tar.gz':
specified: sha256:16w62sm95hmh55rqxn4zwdz0bkh3fqm1qnz9cwi3s510iasb4har
got: sha256:05kc902fcqc4xpzj9ph08ia52dzyc9rpdnn855syy7i3fc4fdxc3
(use '--show-trace' to show detailed location information)
I just checked this by downloading the file outside of Nix and computing the hash. The specified hash (16w62...) is correct.
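For anyone who wants to repeat that check by hand: the hashes in sources.yaml and in the error message look like SHA-256 digests printed in Nix's base32 encoding (52 characters, alphabet without e, o, t, u). Assuming that is the case, a minimal Python sketch to compute the same string for a downloaded file could look like this. The function names are made up for illustration.

```python
import hashlib

# Nix's base32 alphabet: digits and lowercase letters without e, o, t, u.
NIX_BASE32_ALPHABET = "0123456789abcdfghijklmnpqrsvwxyz"


def nix_base32(digest: bytes) -> str:
    """Encode raw digest bytes in Nix's base32 flavour.

    The digest is read as a little-endian integer and written out in
    5-bit groups, most significant group first.
    """
    length = (len(digest) * 8 - 1) // 5 + 1  # 52 characters for SHA-256
    number = int.from_bytes(digest, "little")
    chars = []
    for _ in range(length):
        chars.append(NIX_BASE32_ALPHABET[number & 0x1F])
        number >>= 5
    return "".join(reversed(chars))


def nix_sha256_of_file(path: str) -> str:
    """Hash a file in 1 MiB chunks and return the Nix base32 SHA-256."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return nix_base32(hasher.digest())
```

Calling nix_sha256_of_file on the downloaded colorama-0.4.4.tar.gz and comparing the result with the "specified" and "got" values shows which side is wrong. The output should match what nix-hash --type sha256 --flat --base32 prints for the same file.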
This is how I was able to package your dependencies. I added some flags to nix-build to reduce the tarball cache TTL, so hopefully your hash mismatch goes away. On my machine I don’t get the hash mismatch.
Full GitHub Gist here: https://gist.github.com/kamadorueda/3a6c7250cd10eab99f0e1eb53d857adf
Let me know if it works for you.
Thanks for trying the tool and for the feedback!
brogos
August 23, 2021, 2:19pm
6
Thanks @kamadorueda, but now I’m having this problem:
❯ nix-build --show-trace --option tarball-ttl 1 --option narinfo-cache-negative-ttl 1 --option narinfo-cache-positive-ttl 1 environment.nix
fatal: couldn't find remote ref refs/heads/master
error: program 'git' failed with exit code 128
… while fetching the input 'git+https://github.com/fluidattacks/makes?rev=801523692d3e09c3f95884ad004ad5786c4f3368'
I’m using NixOS-Unstable with flakes activated.
That one is sad, yeah.
For some reason builtins.fetchGit uses the git from the OS (it’s not self-contained).
This is more portable:
makesSrc = nixpkgs.fetchzip {
  url = "https://github.com/fluidattacks/makes/archive/801523692d3e09c3f95884ad004ad5786c4f3368.tar.gz";
  sha256 = "0xflpvwpz8l67wzlvm5xz6vp8gbbcbkgwpi8q8z4mbmr1wzp0kh6";
};
Probably the thing you are trying to use a wheel for doesn’t have a wheel available? Probably due to using a newer version of Python.
As the option name implies it’s about preference of a wheel, but if a compatible one can’t be found we’ll still build from source.
Our wheel tests are still passing so I suspect it’s just a misunderstanding of what the option does.
Poetry2nix is way more focused on 100% correctness than most other tooling in the space is.
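To illustrate what "prefer a wheel, otherwise build from source" means, here is a small Python sketch. It is not poetry2nix's implementation; it only parses wheel filenames according to the standard naming scheme (name-version-python-abi-platform.whl) and falls back to a source archive when no wheel tag is supported. The function names and the supported-tag lists are assumptions made for the example.

```python
def parse_wheel_filename(filename):
    """Split 'name-version(-build)-python-abi-platform.whl' into its parts."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # the build tag is optional
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    # Each tag field may hold several tags joined with '.', e.g. 'py2.py3'.
    python, abi, platform = (set(part.split(".")) for part in parts[-3:])
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python,
        "abi": abi,
        "platform": platform,
    }


def pick_distribution(filenames, supported_tags):
    """Prefer a wheel with a supported (python, abi, platform) tag, else an sdist."""
    for filename in filenames:
        if not filename.endswith(".whl"):
            continue
        wheel = parse_wheel_filename(filename)
        for python, abi, platform in supported_tags:
            if (
                python in wheel["python"]
                and abi in wheel["abi"]
                and platform in wheel["platform"]
            ):
                return filename
    for filename in filenames:
        if filename.endswith((".tar.gz", ".zip")):
            return filename  # no usable wheel: build from source
    return None


# Illustrative file list and tag sets, not taken from a real index query.
files = [
    "numpy-1.21.2-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl",
    "numpy-1.21.2.zip",
]
py38 = [("cp38", "cp38", "manylinux2010_x86_64"), ("py3", "none", "any")]
py310 = [("cp310", "cp310", "manylinux2010_x86_64"), ("py3", "none", "any")]
print(pick_distribution(files, py38))
print(pick_distribution(files, py310))
```

With the Python 3.8 tags the wheel is picked; with tags for an interpreter that has no matching wheel in the list, the sketch falls back to the source archive, which is the "still build from source" case described above.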
brogos
August 23, 2021, 3:17pm
9
Hi @adisbladis, with this pyproject.toml:
[tool.poetry]
name = "inv_stats"
version = "0.1.0"
description = ""
authors = ["brogos <brogos@gmail.com>"]
[tool.poetry.dependencies]
python = "^3.8"
duckduckpy = "^0.2"
python-whois = "^0.7.3"
lxml = "^4.6.3"
requests = "^2.25.1"
geoip2 = "^4.1.0"
pandas = "^1.2.4"
[tool.poetry.dev-dependencies]
ipython = "^7.23.1"
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
and shell.nix:
{ pkgs ? import <nixpkgs> {} }:
let
  myAppEnv = pkgs.poetry2nix.mkPoetryEnv {
    projectDir = ./.;
    preferWheels = true;
  };
in myAppEnv.env
It compiles numpy, cython and pandas.
brogos
August 23, 2021, 3:18pm
10
I’m having that problem with hash mismatch again:
❯ nix-build --option tarball-ttl 1 --option narinfo-cache-negative-ttl 1 --option narinfo-cache-positive-ttl 1 environment.nix
error: hash mismatch in file downloaded from 'https://files.pythonhosted.org/packages/ec/30/8707699ea6e1c1cbe79c37e91f5b06a6266de24f699a5e19b8c0a63c4b65/Cython-0.29.24-py2.py3-none-any.whl':
specified: sha256:11c3fwfhaby3xpd24rdlwjdp1y1ahz9arai3754awp0b2bq12r7r
got: sha256:18c7r4nb3j8ymcrylf6hg0nlsg7a4ybckwm644ksb597gw8mrfpn
(use '--show-trace' to show detailed location information)
I’m not able to reproduce the bug on my machine, on the CI/CD Linux/macOS machines, or on other devs’ machines.
Reading around, the following may help:
1. rm -rf ~/.cache/nix
2. nix-build --option tarball-ttl 1
3. builtins.fetchurl -> nixpkgs.fetchurl
For (3) I made a small modification to the framework:
makesSrc = nixpkgs.fetchzip {
  url = "https://github.com/fluidattacks/makes/archive/1f535fdedafce35a339ae0ac8baffb8ba3c689db.tar.gz";
  sha256 = "0f88sxrvbzl75kvm8d3xsii96cs9vaia037vpwaz1xqhkscy1snf";
};
Let me know if it works.
Even if it works on our machines today, I don’t want this bug to appear later, so I really want to find a solution that works for all of us, including you.
Thanks!
brogos
August 23, 2021, 5:56pm
13
Thanks @kamadorueda! It created the new env. But there are two other problems:
numpy was not installed, see:
$ python
Python 3.8.11 (default, Jun 28 2021, 10:57:31)
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'numpy'
Numpy is not listed in site-packages:
absl future matplotlib pyasn1-0.4.8.dist-info rsa torch
absl_py-0.13.0.dist-info future-0.18.2.dist-info matplotlib-3.4.3.dist-info pyasn1_modules rsa-4.7.2.dist-info torch-1.7.0.dist-info
cachetools google matplotlib-3.4.3-py3.8-nspkg.pth pyasn1_modules-0.2.8.dist-info scipy torchvision
cachetools-4.2.2.dist-info google_auth-1.35.0.dist-info mpl_toolkits __pycache__ scipy-1.7.1.dist-info torchvision-0.8.1.dist-info
caffe2 google_auth-1.35.0-py3.9-nspkg.pth oauthlib pycocotools scipy.libs torchvision.libs
certifi google_auth_oauthlib oauthlib-3.1.1.dist-info pycocotools-2.0.0.dist-info seaborn tqdm
certifi-2021.5.30.dist-info google_auth_oauthlib-0.4.5.dist-info opencv_python-4.5.3.56.dist-info pylab.py seaborn-0.11.2.dist-info tqdm-4.62.1.dist-info
charset_normalizer grpc opencv_python.libs pyparsing-2.4.7.dist-info six-1.16.0.dist-info typing_extensions-3.10.0.0.dist-info
charset_normalizer-2.0.4.dist-info grpcio-1.39.0.dist-info pandas pyparsing.py six.py typing_extensions.py
colorama idna pandas-1.3.2.dist-info python_dateutil-2.8.2.dist-info tensorboard urllib3
colorama-0.4.4.dist-info idna-3.2.dist-info past pytz tensorboard-2.6.0.dist-info urllib3-1.26.6.dist-info
cv2 kiwisolver-1.3.1.dist-info PIL pytz-2021.1.dist-info tensorboard_data_server werkzeug
cycler-0.10.0.dist-info kiwisolver.cpython-38-x86_64-linux-gnu.so Pillow-8.3.1.dist-info PyYAML-5.4.1.dist-info tensorboard_data_server-0.6.1.dist-info Werkzeug-2.0.1.dist-info
cycler.py libfuturize Pillow.libs requests tensorboard_plugin_wit _yaml
dataclasses-0.6.dist-info libpasteurize protobuf-3.17.3.dist-info requests-2.26.0.dist-info tensorboard_plugin_wit-1.8.0.dist-info yaml
dataclasses.py markdown protobuf-3.17.3-py3.8-nspkg.pth requests_oauthlib thop
dateutil Markdown-3.3.4.dist-info pyasn1 requests_oauthlib-1.3.0.dist-info thop-0.0.31.post2005241907.dist-info
pytorch needs libstdc++.so.6. I tried to add nixpkgs.stdenv.cc.cc.lib to searchPaths.bin, but it did not work.
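A quick way to spot the first kind of problem (a package missing from site-packages) without reading the directory listing is to ask the interpreter which modules it can locate. This is a minimal sketch using only the standard library; the helper name missing_modules is made up for illustration.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that the current interpreter cannot locate."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised, for example, when the parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


# Module names as imported in Python, which can differ from the PyPI names
# (opencv-python is imported as cv2, PyYAML as yaml).
print(missing_modules(["numpy", "torch", "cv2", "yaml"]))
```

Note that this only checks whether a module can be found. A shared-library problem such as the missing libstdc++.so.6 shows up only when the module is actually imported, so import torch still has to be tried separately.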
Awesome!
I made a modification so that numpy and libstdc++.so.6 are propagated to the final environment.
I can now import torch and numpy, can you?
let
  makesSrc = nixpkgs.fetchzip {
    url = "https://github.com/fluidattacks/makes/archive/a2271af3b65e817d66b4e2e9a766ad2c3a0c6d49.tar.gz";
    sha256 = "04zbjiv42p444xhfpvzqmzymchwzcrnpd9svhnvkrgzwwvykinqs";
  };
  makes = import "${makesSrc}/src/args/agnostic.nix" { };
  nixpkgs = import <nixpkgs> { };
in
makes.makePythonPypiEnvironment {
  name = "example";
  searchPaths = {
    bin = [ nixpkgs.gcc ];
    rpath = [ nixpkgs.gcc.cc.lib ];
  };
  sourcesYaml = ./sources.yaml;
  withCython_0_29_24 = true;
  withNumpy_1_21_2 = true;
  withWheel_0_37_0 = true;
}
Thanks for the bug report and for helping me improve the thing!
brogos
August 23, 2021, 8:04pm
15
Thanks @kamadorueda, now it’s working!
brogos
August 23, 2021, 8:08pm
16
volth
August 26, 2021, 9:34am
17
BTW, as we have so many great tools for managing Python packages, could we make top-level/python-packages.nix smaller?
Most of the packages there are actually dependencies of tensorflow, ceph, searx, calibre, … which could be vendored.
Currently, it is difficult to upgrade tensorflow without causing a mass rebuild, because it requires a newer pytest or even wheel than the one in top-level/python-packages.nix.
The packages attrset already has an override parameter, which big Python apps like tensorflow could use to upgrade and add packages. This way, top-level/python-packages.nix could be made smaller and cleaner by turning python+deps in tensorflow, ceph, calibre, … into environments with a requirements file and pinned versions in a lock file, and keeping only the most essential packages in top-level/python-packages.nix.