Recommendations for introducing a shared nix store or cache for CI/CD and development

Ah yes, I’d forgotten about that. I did a bit of testing with it, and it was pretty good. It might fit your needs perfectly…

So little time, so many tools!

1 Like

Thanks @toraritte for the suggestion. I do recall seeing the announcement (but had completely forgotten about it, since it was in beta).

The situation for me, though, is that myself and the developers on my team are on Darwin (aka macOS), whereas our GitLab CI/CD naturally uses Linux Docker images to build our software (which includes Spring Boot services, React apps, DB migration tools, cloud IaC, etc.). A number of developers in our company are also on Windows (none of whom are using Nix).

So I suppose there’d be limitations on what could be shared between developers and CI/CD pipelines.
But Flox definitely looks good.

2 Likes

You can make these non-native runners work, but it’s not ideal. Even keeping them up and running can be a chore: Nix doesn’t allow automatic code updates, while runners expect a system where they can bump and change their code without warning. Run an old runner and it breaks the API; GitHub, at least, rejects it, and it’s difficult to automate recovery, because you have to rekey the runner and bump the version manually… a real PITA. Having a broken pipeline is no fun.

So: set up your own runners, and don’t mess with the current pipeline.

Nix gives you a lot of cool stuff, but most developers don’t even think they need what nix has to offer.

And I just figured out that GitHub’s self-hosted runners are written in C#/.NET, so I guess the Microsoft Borg assimilation of GitHub is starting. I’m not sure how I feel about that. Maybe I shouldn’t feel anything…

1 Like

So I’ve tried bind-mounting a shared volume’s directory in the GitLab job…

  before_script:
    - set -x
    - mkdir -p /usr/share/gitlab/.nix/store
    - mount -o bind /usr/share/gitlab/.nix/store /nix/store

And I’m getting this error, of course:

mount: permission denied (are you root?)

Any suggestions?

My recommendation is to keep it simple

Mounting /nix/store on the runners is complex. I have personally tried that, and it’s neither as shared nor as hands-off (SaaS-like) as you would want it to be.

On the other hand, Cachix works out of the box at an excellent price. In my case, selling it to my company was easy: hey, we are in the business of cybersecurity, not of maintaining binary caches! The less time we spend maintaining binary caches, the more time we spend adding value to our customers. Cachix is the way to go: just cachix use, cachix push, and enjoy.
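
For reference, a rough sketch of what that looks like in a GitLab job (the cache name mycache and the CACHIX_AUTH_TOKEN CI variable are placeholders for your own setup):

build:
  image: nixos/nix
  script:
    # install the cachix CLI (the command from the Cachix docs)
    - nix-env -iA cachix -f https://cachix.org/api/v1/install
    # the token is stored as a masked CI/CD variable
    - cachix authtoken "$CACHIX_AUTH_TOKEN"
    # configures the substituter and trusted public key for the cache
    - cachix use mycache
    - nix-build
    # push the result's closure to the cache
    - cachix push mycache ./result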

However, if for business constraints you still require something else, the next simplest alternative is S3. Place nix copy --to 's3://example-nix-cache?profile=cache-upload&region=eu-west-2' ./result in GitLab’s after_script for writing, and just set up substituters for reading in the before_script. You can add a lifecycle policy to the bucket for automatic garbage collection once objects reach a certain age.
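
Roughly, the whole thing in .gitlab-ci.yml looks like this sketch (bucket, region, profile and key names are placeholders, and the signing key is assumed to be provisioned on the runner):

build:
  image: nixos/nix
  before_script:
    # reading: add the bucket as a substituter and trust its signing key
    - echo "substituters = https://cache.nixos.org s3://example-nix-cache?region=eu-west-2" >> /etc/nix/nix.conf
    - echo "trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= example-nix-cache-1:REPLACE_WITH_YOUR_PUBLIC_KEY" >> /etc/nix/nix.conf
  script:
    - nix-build
  after_script:
    # writing: the secret-key store parameter signs paths as they are uploaded
    - nix copy --to 's3://example-nix-cache?profile=cache-upload&region=eu-west-2&secret-key=/etc/nix/signing-key.sec' ./result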

We currently run our company on Nix and GitLab at a pace of 50-100 daily deployments to production, with ~200 jobs per deployment (dev + prod).

We use 100% and only this:

If you want to give it a try, adding support for writing to S3 caches to the framework should be very simple. Reading S3 caches is already possible.

6 Likes

@nixinator yeah, it was a pain before; then we started using this:
GitHub - cattle-ops/terraform-aws-gitlab-runner: Terraform module for AWS GitLab runners on ec2 (spot) instances, and they now even automatically register themselves with GitLab’s coordinator.

This is the code for our CI/CD infra: makes/foss/modules/makes/ci · 489c1b1462848668332f325914f211e3563f21a5 · Fluid Attacks / universe · GitLab

You can’t get cheaper and more scalable at the same time than that.

1 Like

I’m going to have to second @kamadorueda. Flox is certainly a valid suggestion, but if you want or need to keep it FOSS, then makes is probably the best bet right now, since it is designed as a CI framework. It has a single invocation, m . __all__, to build all defined tasks beforehand and optionally upload them to a Cachix cache (we plan on making this more general and adding an option to simply call nix copy soon), so that you only have to build your CI tasks once, and every runner can just pull from the cache.

2 Likes

Oh, actually that’s a fail anyway, as /nix/store already exists in the nixos/nix Docker image.

I wonder if it’s possible to layer them, treating the /nix/store from the Docker image as read-only.

Thanks for the suggestion @kamadorueda. That sounds great.

Motivation / pain points

We were just hoping to avoid having to download/upload derivations repeatedly per job, and instead have the store mounted and immediately accessible.

So I realise now there are a few problems with this:

  1. The Docker image we’ve used so far (nixos/nix) itself already has a /nix/store. We could potentially look at using an overlay filesystem to layer this with a bind-mounted /nix/store from the host/network. Perhaps.
  2. The Docker container (nixos/nix) would maintain its own store DB. Persisting this doesn’t sound trivial (with simultaneous reads/writes across jobs).

Bootstrap Container Store?

Perhaps the above could be worked around by creating a Docker image that has static Nix tools in an alternate location, allowing /nix/store to contain only fetched/built derivations. But again, the question of a persisted and shared/served Nix store with its DB remains. See @zimbatm’s suggestion on this here.

S3 Store Pro/Cons?

So another thing we’ve considered is using a read/write S3 store. I presume/hope this configuration would support simultaneous reads and writes, but it would again involve pushing/pulling as required, instead of something that’s immediately available at a mount point.

Longer Term

So longer term I’d want to move to more sophisticated tools like makes, as you suggest, @kamadorueda.

Short Term

I’m looking for a quick win. The primary caveat at the moment is downloading artifacts for each job from cache.nixos.org. That’s pretty quick, to be fair, but any time we can shave off there is a win.

Something like storing/restoring app paths just for our project/branch?
Of course, downloading from cache.nixos.org isn’t the biggest pain point at the moment, just an obvious point of optimisation.
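
One crude sketch of that (assuming ./result is what the job builds): let GitLab’s own per-branch job cache carry an exported closure between jobs.

build:
  image: nixos/nix
  cache:
    key: "$CI_COMMIT_REF_SLUG"  # one cache per branch
    paths:
      - closure.nar
  script:
    # restore previously built paths into the store, if a cache was downloaded
    - if [ -f closure.nar ]; then nix-store --import < closure.nar; fi
    - nix-build
    # export the full runtime closure of the result for the next job
    - nix-store --export $(nix-store -qR ./result) > closure.nar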

1 Like

I like CI/CD pipelines with as few moving parts as possible. There are a lot of moving parts there, and with every build you do, you’re bringing Amazon deliveries to the International Space Station a step closer. This might be your plan, or it may not.

That’s a nice diagram! Very colourful.

I think it’s https://www.cloudcraft.co/

1 Like

Very interesting resource! Thanks.
I just copied the image from their repo’s README.
But now I know how to make them!

If I’m not mistaken, the only thing preventing one from having a store shared over NFS is that Nix uses SQLite to store the metadata. It would be interesting to see if that could be replaced with a database that actually supports networking.

3 Likes

Have you looked at using post-build-hook?
See Untrusted CI: Using Nix to get automatic trusted caching of untrusted builds for how to set it up.
It requires very little additional code / moving pieces / infrastructure (just an S3 bucket), and handles downloading/uploading only what is needed.
We found it to be very simple to set up and use, and it behaves exactly as we expect it to.
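
The hook itself is tiny; this is essentially the sketch from the Nix documentation, with a placeholder bucket name:

#!/bin/sh
# /etc/nix/upload-to-cache.sh, referenced from nix.conf as:
#   post-build-hook = /etc/nix/upload-to-cache.sh
# Nix runs it after every successful build, passing the new store
# paths in the OUT_PATHS environment variable.
set -eu
set -f # disable globbing
export IFS=' '
echo "Uploading paths" $OUT_PATHS
exec nix copy --to 's3://example-nix-cache?region=eu-west-2' $OUT_PATHS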

4 Likes

I’ll chime in with my experience integrating Nix to do the CI for a small, private Rust project on a self-hosted GitLab.

I did this a year ago, so details are fuzzy.

I wanted to experiment with a shell runner that invoked a single-user Nix installation. This acted as an alternative to the daemonized world of Docker.
The performance seemed better. Our project pulls in clang and other system dependencies, so Nix’s store wonderfully avoided duplicating those downloads.

Is having this read/writeable recommended? How would people manage garbage collection of packages over time?

This is where I struggled. We self-host the GitLab runner on an EC2 instance, and we kept filling its disk with the contents of /nix/store. I would manually run nix-collect-garbage -d to prune packages. Looking back, I probably should’ve created a scheduled pipeline to do that!
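
For what it’s worth, such a scheduled job is only a few lines of .gitlab-ci.yml (the 30-day retention below is arbitrary, and the schedule itself is created under CI/CD → Schedules):

nix-gc:
  rules:
    # only run when triggered by a pipeline schedule
    - if: '$CI_PIPELINE_SOURCE == "schedule"'
  script:
    - nix-collect-garbage --delete-older-than 30d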

In the end I moved away from the Nix solution because I am the only one on my team familiar with Nix and I ran out of energy. I moved to a GitLab Docker runner and various Docker images for our jobs for ease of maintenance.

I’m wondering about my decisions as I investigate optimizing our Rust project’s CI pipeline.

makes looks promising! I may also try my shell runner with Nix directly again. This time I’ll implement garbage collection, or see if I can use s3fs or some Linux trickery to mount larger disks for Nix to use.

1 Like

We also run a private GitLab and have tried several different techniques for managing Nix caching.

One of the key selling points of Cachix is its support for LRU-based garbage collection, which I am not aware of any other Nix caching solution having.

When using S3 as a cache there is no GC available that I am aware of, so your cache will just keep growing.

When using SSH, the cache machine just needs Nix installed; you can then set GC roots on that machine to keep paths, and run GC there at regular intervals.

As for populating the cache: you do it much like with Cachix, but with nix copy or nix-copy-closure. One difference is that you have to sign the paths yourself before uploading, which is something Cachix does automatically for you.
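
Roughly, with a placeholder host name and key file names:

# one-time: generate a signing key pair for the cache
nix-store --generate-binary-cache-key example-cache-1 cache-priv-key.pem cache-pub-key.pem

# per build: sign the result's closure, then copy it to the cache machine
nix sign-paths --recursive --key-file cache-priv-key.pem ./result
nix copy --to ssh://nix-cache.example.com ./result

Consumers then list ssh://nix-cache.example.com in their substituters and the contents of cache-pub-key.pem in trusted-public-keys.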

We also have a GitLab Docker runner that mounts the host’s /nix/store and Nix daemon socket and uses them for building and as an extra cache. Not quite your shared Nix store idea, but it has nice job isolation with Docker and good host caching.

One way you could have a shared Nix store is by implementing a process kind of like what the NixOS ISO does: it contains a Nix store and a DB dump (nix-store --dump-db), mounts that read-only store together with a writable layer using overlayfs, and then runs nix-store --load-db to initialize the SQLite DB.
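
A rough sketch of that dance, assuming the shared read-only store is exported at /ro-nix together with a registration file:

# on the machine that produces the shared store:
nix-store --dump-db > /ro-nix/registration

# on each consumer: lay a writable layer over the read-only store...
mkdir -p /nix/store /nix/.rw /nix/.work
mount -t overlay overlay \
  -o lowerdir=/ro-nix/store,upperdir=/nix/.rw,workdir=/nix/.work \
  /nix/store

# ...and register the shared paths in a fresh local Nix database
nix-store --load-db < /ro-nix/registration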

5 Likes

Can you elaborate on how you do that? I’m looking for a similar solution, but I ended up using a shell runner to be able to use the local Nix store, and it’s really bad for reproducibility…

My build machine runs NixOS, so I mount /nix, /run/current-system, /etc/ssl/certs/ca-bundle.crt and /etc/ssl/certs/ca-certificates.crt from the host into the runner, and then use a small Alpine Docker image built from this Dockerfile:

FROM alpine:edge
# bash for the login shell, curl for fetching; everything else
# comes from the host's /nix and /run/current-system mounts
RUN apk add bash curl
# NIX_REMOTE=daemon makes nix clients talk to the host's nix-daemon;
# PATH picks up the tools from the host's system profile
ENV \
    ENV=/etc/profile \
    TMP=/tmp \
    NIX_REMOTE=daemon \
    PATH=/run/current-system/sw/bin:/bin:/usr/bin \
    GIT_SSL_CAINFO=/etc/ssl/certs/ca-certificates.crt \
    NIX_SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
CMD ["/bin/bash", "-l"]

This is the part of my GitLab Runner config.toml that configures Docker for this setup:

[runners.docker]
disable_cache = false
disable_entrypoint_overwrite = false
image = "lisberg/ci-host-nix"
oom_kill_disable = false
privileged = false
shm_size = 0
tls_verify = false
# the important bit: every job shares the host's /nix and system profile
volumes = ["/nix:/nix", "/run/current-system:/run/current-system", "/etc/ssl/certs/ca-bundle.crt:/etc/ssl/certs/ca-bundle.crt", "/etc/ssl/certs/ca-certificates.crt:/etc/ssl/certs/ca-certificates.crt", "/cache"]

In my setup I also have a custom helper image, but I don’t actually think it is needed any more, so I have left it out. If you try this setup and have problems with permissions you might need the custom helper image, but I am pretty confident it is no longer needed.

3 Likes

I’ve been exploring buildkite.com lately (free for open source) and I’m very happy with the results:

  • You can launch machines in an autoscaling group, or use a single machine running NixOS
  • Each machine is configured like this so that N agents run inside N systemd-nspawn containers, and by default all N nspawn containers share the host /nix/store, so it’s very shared and fast (see the sketch below)
  • Builds are as fast as if I ran them locally; the overhead of Buildkite, the agents, and the nspawn containers is imperceptible. I push and the jobs immediately start, all deps already cached. Just delicious
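
For the curious, the sharing part boils down to a bind mount; a hand-rolled equivalent would be something like this (the container root path is a placeholder):

# run an agent container that reuses the host's /nix wholesale
systemd-nspawn \
  --directory=/var/lib/machines/agent \
  --bind=/nix \
  /bin/sh
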
2 Likes

@kamadorueda: Do you have an open-source reference for how you use that?