Python builds keep *.pyc in store - dockerTools

While building Python Docker images with mach-nix and dockerTools, I noticed when analyzing the final image with dive that all __pycache__ folders are present in the Nix store, together with their *.pyc files.

This roughly doubles the image size. Is this something to be tackled in how Python packages are built in Nix, or at the dockerTools level?

Or anywhere else altogether? (Where?)
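To put a number on the "roughly doubles" impression, a small script can total the bytecode under an unpacked image or a store path. This is only a sketch: the function name `bytecode_share` is made up for illustration, and it counts every *.pyc / *.pyo file regardless of which package it belongs to.

```python
import os


def bytecode_share(root):
    """Return (bytecode_bytes, total_bytes) for all files under root."""
    bytecode = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat: measure symlinks themselves instead of following them
            size = os.lstat(path).st_size
            total += size
            if name.endswith((".pyc", ".pyo")):
                bytecode += size
    return bytecode, total
```

Calling `bytecode_share` on the directory of an extracted layer gives the two byte counts, from which the percentage follows directly.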

It’s a side effect of testing: the tests run against the installed package, so Python’s normal behavior of generating the py[co] files kicks in.
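The behavior is easy to reproduce outside Nix. The sketch below imports a throwaway module in a child interpreter and checks whether a __pycache__ directory appeared; with the interpreter's `-B` flag no bytecode is written. The helper name `pycache_created` and the module name `demo_mod` are made up for this example.

```python
import os
import subprocess
import sys
import tempfile


def pycache_created(extra_args=()):
    """Import a throwaway module in a child interpreter and report
    whether a __pycache__ directory was written next to it."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "demo_mod.py"), "w") as fh:
            fh.write("VALUE = 1\n")
        # Drop settings that would change bytecode behaviour in the child.
        env = {
            key: value
            for key, value in os.environ.items()
            if key not in ("PYTHONDONTWRITEBYTECODE", "PYTHONPYCACHEPREFIX")
        }
        env["PYTHONPATH"] = tmp  # make demo_mod importable
        subprocess.run(
            [sys.executable, *extra_args, "-c", "import demo_mod"],
            cwd=tmp,
            env=env,
            check=True,
        )
        return os.path.isdir(os.path.join(tmp, "__pycache__"))
```

`pycache_created()` exercises the default behavior, while `pycache_created(["-B"])` runs the same import with bytecode writing disabled. Setting `PYTHONDONTWRITEBYTECODE` in the environment has the same effect as `-B`.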

Thanks, I was almost suspecting this.

In your personal opinion, where should I seek to fix (better: improve) this?

I don’t think there is anything being done to address cleaning of binary files on wheel installation. If you build the images with Nix’s dockerTools, you should get roughly one layer per package, so the actual cost should be pretty low.

To answer your question about “improving” this: there are some trade-offs. pyc files are faster to load, but obviously increase image size. I don’t think there has been any work on removing them in their entirety. There has been some work on recompiling them for reproducibility, though.


Ok, thanks! That is some interesting input!

Since I usually use bind mounts for interpreted languages when debugging locally, py[oc] shall prevail over py before my eyes.

I now tend to think this is something that dockerTools might want to contemplate (remove non compiled files from interpreted languages)?

And I wonder how to proceed, given my current skillset, should I want to push things?

If you wanted to remove the generated files for all packages, you would probably need to create a hook and override buildPythonPackage to use the “removeBinCodeAfterInstallHook” (or whatever you want to call it) by default.

The main downside of this is that you would essentially be diverging from master and would have to build all Python packages locally.

Here’s a PR that adds a Python hook: https://github.com/NixOS/nixpkgs/pull/90208/files . You could probably upstream a “removeBinaryOutputHook” (or something better) that isn’t included by default, and then, in an overlay, override buildPythonPackage so that the hook is used by default.
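What such a hook would do to each package's output can be sketched in Python. This is not the nixpkgs hook itself (a real setup hook would more likely be a shell snippet run during a build phase, while the output is still writable); the function names are hypothetical and only illustrate the cleanup step.

```python
import os
import shutil


def _tree_size(path):
    """Sum the sizes of all files below path without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total


def strip_bytecode(root):
    """Delete __pycache__ directories and stray *.pyc / *.pyo files under
    root. Returns the number of bytes freed."""
    freed = 0
    for dirpath, dirnames, filenames in os.walk(root):
        if "__pycache__" in dirnames:
            cache = os.path.join(dirpath, "__pycache__")
            if os.path.islink(cache):
                os.remove(cache)  # remove the link, leave its target alone
            else:
                freed += _tree_size(cache)
                shutil.rmtree(cache)
            # Tell os.walk not to descend into the directory just removed.
            dirnames.remove("__pycache__")
        for name in filenames:
            if name.endswith((".pyc", ".pyo")):
                path = os.path.join(dirpath, name)
                freed += os.lstat(path).st_size
                os.remove(path)
    return freed
```

Run against a package's output directory, this leaves the *.py sources in place, so the interpreter can still import everything and simply compiles in memory at load time (the slower-load side of the trade-off mentioned above).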