Correctly limiting S3 Bucket size using Lifecycle Policy

I have a cache in S3 that a CI process has been pushing derivations to. This bucket had a lifecycle policy set on it which caused it to delete files after 30 days, and now I am getting errors about certain .nar.xz files being absent from the cache - not surprising, as they probably expired.

Now I have two problems:

  1. How can I repair the bucket’s idea of what derivations it contains? I assume there is some metadata in the bucket that’s now out-of-date, as a fresh CI runner will complain about missing .nar.xz files.
  2. I would like to have some kind of expiration on old derivations so that the bucket doesn’t retain things forever. Is there a safe way to do this?

You’ll need to parse all the .narinfo files, build the dependency tree from their References fields, and delete all the dangling entries (narinfos whose NAR file, or whose referenced paths, have already expired).
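
Untested sketch of that idea, assuming boto3 and the flat cache layout that `nix copy --to s3://...` produces (`<hash>.narinfo` objects alongside `nar/*.nar.xz`); the bucket name is a placeholder:

```python
# Prune a Nix binary cache in S3: drop .narinfo entries whose NAR file
# (or any transitive reference) has expired. Sketch only -- dry-run by default.
import boto3

BUCKET = "my-nix-cache"  # placeholder bucket name
s3 = boto3.client("s3")

def list_keys(bucket):
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            yield obj["Key"]

keys = set(list_keys(BUCKET))
narinfos = {k for k in keys if k.endswith(".narinfo")}

# Parse the URL: and References: fields out of every narinfo.
info = {}
for key in narinfos:
    body = s3.get_object(Bucket=BUCKET, Key=key)["Body"].read().decode()
    fields = dict(line.split(": ", 1) for line in body.splitlines() if ": " in line)
    info[key] = (fields.get("URL", ""), fields.get("References", "").split())

# Iterate to a fixpoint: drop narinfos whose NAR object is gone, then drop
# narinfos referencing a dropped store path, and so on.
live = set(narinfos)
changed = True
while changed:
    changed = False
    for key in list(live):
        url, refs = info[key]
        # References hold store path basenames "<hash>-<name>";
        # the matching metadata object is "<hash>.narinfo".
        ref_keys = {r.split("-", 1)[0] + ".narinfo" for r in refs}
        if url not in keys or not ref_keys <= live:
            live.discard(key)
            changed = True

for key in sorted(narinfos - live):
    print("would delete", key)  # swap in s3.delete_object(...) once verified
```

That also covers question 1: the .narinfo objects *are* the bucket’s metadata, so removing the ones whose NARs expired stops clients from being offered missing .nar.xz files.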

Alternatively you can push the whole S3 bucket to cachix.org and it will do that for you :slight_smile:


So what I have heard people doing is putting a load balancer in front of several S3 buckets. Then, instead of a lifecycle policy, they delete the older buckets and create new ones; see the sketch below.
Also, I think we used to have scripts in nixpkgs for deleting old derivations from buckets. Maybe @andir has more information on this?
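
A minimal sketch of that rotation idea, assuming boto3 and a hypothetical `nix-cache-YYYY-MM` naming scheme (`KEEP` is how many monthly buckets to retain):

```python
# Untested sketch: rotate date-stamped cache buckets instead of using
# per-object lifecycle rules. Naming scheme and retention are assumptions.
import boto3
from datetime import date

PREFIX = "nix-cache-"  # hypothetical: buckets named nix-cache-YYYY-MM
KEEP = 3               # how many monthly buckets to retain

s3 = boto3.client("s3")

# Create this month's bucket if it is missing (outside us-east-1 you also
# need a CreateBucketConfiguration with a LocationConstraint).
current = f"{PREFIX}{date.today():%Y-%m}"
existing = sorted(b["Name"] for b in s3.list_buckets()["Buckets"]
                  if b["Name"].startswith(PREFIX))
if current not in existing:
    s3.create_bucket(Bucket=current)
    existing.append(current)

# Drop whole buckets past the retention window; S3 requires a bucket to be
# empty before it can be deleted.
for name in existing[:-KEEP]:
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=name):
        for obj in page.get("Contents", []):
            s3.delete_object(Bucket=name, Key=obj["Key"])
    s3.delete_bucket(Bucket=name)
```

Deleting a whole bucket can never leave half a closure behind the way a per-object expiry does, which is the appeal of this approach.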