Bob: A build system for microservices powered by Nix

zuzuleinen · August 26, 2022, 10:52am

Bob is a high-level build system that isolates pipelines from the host system: it executes them in a sandboxed shell with binaries installed by Nix, so they can run anywhere.

Designed to bring back the joy of microservice development.

blaggacao · August 26, 2022, 6:34pm

A couple of interested questions:

(1) So this extends the build DAG onto the notion of pipelines?
(2) Can it produce side effects on the repository (i.e. can it be used as a task runner)?
(3) Does it work on Mac only if instructions are multiplatform? (I’d assume so, since nix build via runc isn’t yet a thing.)
(4) Is the use of the word “microservices” a marketing plug?
(5) Is the YAML interface optional in case Nix is your lingua franca for config management?
(6) Does it have an optional native interface to Nix builds (if you want more than just isolated environments)?

If some of those boxes tick true for me, I might be interested in investing time in bringing prime support to divnix/std (“A DevOps framework for the SDLC with the power of Nix and Flakes. Good for keeping deadlines!”).

Specifically: divnix/std issue #49, “Add `std flow`”.

PS: I pretty much recognize dagger.io, here — a coincidence?

Some random ideas brainstorming:

dependencies:
    - ginkgo_1.16.5.nix

→

dependencies:
    - //cellname/organellname/target-package

For true (data-)pipelines, it is sometimes required to process the outputs of a previous task as the inputs of a subsequent one. I’ve been looking at nushell (side benefit: cross-platform shell) in the past. @zuzuleinen what are your thoughts on this?
Is the remote cache planned to be a Nix cache (at least as an option)? It would be nice if such user-friendly tooling maintained (maybe even undocumented) escape hatches into a more Nix-centric (as in “lingua franca”) world.
For context on why this is interesting for std: there is one possible adoption of std which postulates the 4 layers of packaging (package, entrypoint, OCI image, scheduler manifest + observability config). Since that schema can be highly standardized, any execution tooling (e.g. bob or a CI) can essentially be primed with a “preset” that already knows what to do (no need to define anything). Still, running that same thing locally through a nice CLI and combining it seamlessly with a local dev env (via docker-compose) is pretty much a wide-open, known gap in its workflow.
docker-compose: have you considered foreman, hivemind & co.? Or even something Nix-native, like arion, as a possible type?

matnosner · August 28, 2022, 9:30pm

Hey David, thanks for your enthusiasm and helpful links.

(1) You can define an entire DAG with Bob’s tasks. So when speaking of pipelines, we really mean a task pipeline as you are used to from most CI systems.
(2) Yes, intended!
(3) Yes, instructions must be multiplatform, though it depends a bit on the use case.
(4) No. Bob was developed out of the pain of building a polyglot microservice application which used code generation for the APIs. Bazel was not an option due to its complexity. Sure, the word “microservices” is a coined term, but I believe that it is the kind of system for which Bob provides the most benefits. Feel free to replace it with multi-service, distributed-monolith or even monorepo.
(5) The DAG is built from the tasks defined in a bob.yaml. If you statically link against Bob, you can populate the structs directly yourself. Our tests (https://github.com/benchkram/bob/blob/6403a1de16304256725a06bba1522489ed1ccb27/test/e2e/multilevelbuild/multilevelbuild_test.go#L36) & the playground can probably give you some inspiration here. The programmatic interface might need some polishing before it gives you full peace of mind.
(6) No. We use nix-build under the hood (https://github.com/benchkram/bob/blob/6403a1de16304256725a06bba1522489ed1ccb27/pkg/nix/nix.go#L79) to download the packages and get the correct paths to the packages in `/nix/store/`. We don’t allow you to alter the given flags, but this could easily be made more generic.
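
To make point (1) a bit more concrete, here is a minimal sketch of how a task DAG yields an execution order. It is written in Python purely for illustration and is not Bob's implementation; the task names are made up, and the ordering comes from the standard library's `graphlib` rather than anything Bob-specific.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks (not taken from any real bob.yaml).
# Each task maps to the list of tasks it depends on.
tasks = {
    "generate-api": [],
    "build-server": ["generate-api"],
    "build-client": ["generate-api"],
    "package": ["build-server", "build-client"],
}


def build_order(task_graph):
    """Return one valid execution order: dependencies come before dependents."""
    return list(TopologicalSorter(task_graph).static_order())


order = build_order(tasks)
print(order)
```

Any order in which `generate-api` runs first and `package` runs last is valid here; the two build tasks are independent of each other and could also run in parallel.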

Dagger.io

Yes, it’s kind of related to dagger.io or earthly.dev, as the problem space we are trying to solve is similar: constructing a DAG on top of build isolation. The difference is that they use BuildKit under the hood.

Data Pipelines

Never looked into nushell. Bob is made to execute task pipelines, though there is nothing to stop you from setting up data pipelines with Bob. The outputs → inputs from task to task must be shared through the filesystem, however. Caching and artifacts might be a performance problem, as the target for each task is stored in a tarball.
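
A rough sketch of what that looks like in practice, in Python and under simplifying assumptions: one task writes its target to disk, the target is packed into a tarball as a stand-in for a cached artifact, and the next task unpacks it as its input. The file names and directory layout are hypothetical and do not reflect Bob's actual cache layout.

```python
import tarfile
import tempfile
from pathlib import Path


def produce(workdir: Path) -> Path:
    """First task: write its result to a file (the task's target)."""
    target = workdir / "numbers.txt"
    target.write_text("1\n2\n3\n")
    return target


def store_artifact(target: Path, cache_dir: Path) -> Path:
    """Pack the target into a gzip-compressed tarball in the cache directory."""
    artifact = cache_dir / (target.name + ".tar.gz")
    with tarfile.open(artifact, "w:gz") as tar:
        tar.add(target, arcname=target.name)
    return artifact


def consume(artifact: Path, workdir: Path) -> int:
    """Second task: restore the file from the artifact and sum its numbers."""
    with tarfile.open(artifact, "r:gz") as tar:
        data = tar.extractfile("numbers.txt").read()
    restored = workdir / "numbers.txt"
    restored.write_bytes(data)
    return sum(int(n) for n in restored.read_text().split())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("task1", "cache", "task2"):
        (root / name).mkdir()
    artifact = store_artifact(produce(root / "task1"), root / "cache")
    total = consume(artifact, root / "task2")

print(total)
```

Every hand-over between tasks pays for a pack and an unpack step, which is where the performance concern for large data sets comes from.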

Remote Cache

I have only looked very briefly into the little publicly available documentation on how cachix works, but I guess it’s technically possible. As of now it’s not planned, as we are trying to solve problems for application developers. Though I will reconsider if I see a clear benefit.

Packaging

You’ve got my attention… that definitely overlaps with our long-term vision. I mostly have experience with building container images through `docker build`, which we use in some projects to create containers and run them using `bob run` (which can execute docker compose files & control the lifecycle). It might take some time to stabilize `bob run`, though.

Foreman, Hivemind

No, I wasn’t even aware of them. Checking…!

Thx for pointing out https://norouter.io in the repo, something like this looks very, very essential for having control over hybrid dev environments.

Does this answer your questions?

Cheers,
Matthias

blaggacao · August 29, 2022, 1:13am

Thanks for the detailed reply! I was actually just on the lookout for some code pointers…

I just tried to capture our current best practices (@ IOG / IOHK) in a publicly available pattern explainer:

https://github.com/divnix/std/blob/de39e5736392a7dc626265d5ea2be21f9ac71cf3/docs/patterns/four-packaging-layers.md

> _This is an opinionated pattern._
>
> _It helps structure working together on microservices with `std`._

# The 4 Layers of Packaging

## The Problem

We have written an application and now we want to package and run it.
For its supply chain security benefits, we have been advised to employ reproducible and source-based distribution mechanisms.
We furthermore need an interoperability platform that is generic and versatile: a _configuration "lingua franca"_.
Our peers who write another application in another language should share that same approach.
Thereby, we avoid the negative external effects of DevOps silos on integrators and operators, alike.
Short: we make adoption of our application as easy as possible for our consumers.

## The Actors

_Note, that each actor persona can be exercised by one and the same person or a group of persons.
Although possible, and even frequently so, it doesn't imply that these roles are necessarily taken by distinct individuals._

This file has been truncated.

The way to make a CI description obsolete for all parts of the workflow except any “Probing & Attestation” stages (such as e2e testing, benchmarking, load testing, monkey testing, property testing, fuzzing, etc.) is the following:

All involved organelles (packages.nix, entrypoints.nix, oci-images.nix & <shed>Charts.nix) are inherently typed and easily discoverable by pipeline tooling out of the box (there is even already a hidden implementation of an `after` keyword in std that can be used to construct a DAG across cells, organelles and targets).

Each of these typed organelles (types are called, esoterically, “clades”) has well-known clade actions. For example, for `clades.containers`, there is `copy-to-registry`, or in full on the CLI: `std //cell/oci-images/myimage:copy-to-registry`.

Consequently, for the core parts of building, testing, up to publishing artifacts, the CI could already “know” what to do without any further instructions, as a collateral side effect of code organization and Nix-based (source-based) distribution.
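
To illustrate the idea (and only the idea), here is a small Python sketch of how a tool could derive commands from typed targets without any hand-written pipeline description. The only clade/action pairing taken from this thread is containers with `copy-to-registry`; the lookup table, the function name and the tuple layout are assumptions made for the example, not part of std.

```python
# Known clade actions. Only this single pairing is mentioned in the thread;
# a real preset would presumably list more clades and actions.
CLADE_ACTIONS = {
    "containers": ["copy-to-registry"],
}


def std_commands(targets):
    """Build CLI invocations for every action known for each target's clade.

    `targets` is a list of (cell, organelle, target, clade) tuples,
    a hypothetical shape chosen for this sketch.
    """
    commands = []
    for cell, organelle, name, clade in targets:
        for action in CLADE_ACTIONS.get(clade, []):
            commands.append(f"std //{cell}/{organelle}/{name}:{action}")
    return commands


commands = std_commands([("cell", "oci-images", "myimage", "containers")])
print(commands)
```

The point is that the input is nothing more than the discovered, typed targets; no per-project pipeline definition is needed to arrive at the command list.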

So this is where I see a strong value of a “Bob preset for Standard” or something like that.

The beauty of Bob here is that adoption can be gradual: start with a bob.yaml and only adopt the std-ized “4 Layers of Packaging” with some help from the “big boys” (in the subject of source-based distribution) in your organization… And it’ll still be `bob run` and feel magically the same. Just now, your downstream integrators can also experience the joy.

EDIT: in this context, please also have a look at the following draft doc on how to make OCI-image building a truly emergent property of the “entrypoint discussion” (between Devs & Ops).

https://github.com/divnix/std/blob/f64c112aa4c98f76663abe96ac27552920875f73/cells/std/lib/writeShellEntrypoint.md

# `writeShellEntrypoint`

... is a function to write Standard OCI-image entrypoints.

The function signature is as follows:

```nix
{
  # the installable that is wrapped by this entrypoint (re-exported)
  package,
  # the literal bash string of the entrypoint that will be wrapped
  entrypoint,
  # initialize environment variables with these defaults
  env ? {},
  # runtime installables that the entrypoint or liveness/readiness probe uses (re-exported)
  runtimeInputs ? [],
  # domain specific debugging utilities (re-exported)
  debugInputs ? [],
  # domain specific liveness probe literal bash fragment (re-exported)
  livenessProbe ? null,
  # (signature truncated in the excerpt)
```

This file has been truncated.

ParetoOptimalDev · August 29, 2022, 10:31pm

As an application developer, I’m very interested in waiting less thanks to remote caches.

blaggacao · August 30, 2022, 1:00am

Indeed. In the JavaScript world, nx (https://nx.dev) even tries to replicate the Nix cache, for example.

(And maybe that name is also a shameless rip, who knows)

matnosner · August 30, 2022, 10:45am

Any specific use case in mind?

Getting/pushing artifacts from/to a remote is basically influenced by

(1) bandwidth
(2) compression rate
(3) what is transferred: the entire artifact or an increment

(2) and (3) can be tuned for a specific use case.
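
A back-of-the-envelope sketch of how the three factors combine, in Python. The formula is a deliberate simplification (no latency, no compression or decompression time), and all numbers are made up for illustration.

```python
def transfer_seconds(size_mb, bandwidth_mbit_s,
                     compression_ratio=1.0, changed_fraction=1.0):
    """Estimate the transfer time of an artifact in seconds.

    size_mb            uncompressed artifact size in megabytes
    bandwidth_mbit_s   link bandwidth in megabits per second      (1)
    compression_ratio  uncompressed size / compressed size        (2)
    changed_fraction   share of the artifact actually sent,
                       1.0 for the entire artifact                (3)
    """
    payload_mb = size_mb * changed_fraction / compression_ratio
    return payload_mb * 8 / bandwidth_mbit_s


# Hypothetical 800 MB artifact over a 100 Mbit/s link.
full = transfer_seconds(800, 100)
compressed = transfer_seconds(800, 100, compression_ratio=4)
incremental = transfer_seconds(800, 100, compression_ratio=4,
                               changed_fraction=0.1)
print(full, compressed, incremental)
```

With these assumed numbers, compression and incremental transfer each cut the time by a constant factor, while bandwidth is the one variable the tool cannot tune.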

Or did you rather mean the process of setting up a remote cache in the first place?

Solene · August 30, 2022, 10:53am

> Write once. Build once.
> Anywhere.

“Anywhere” as long as you are using Windows / Linux / macOS?

blaggacao · August 30, 2022, 12:56pm

ad (3): this is an interesting point and has two aspects:

Components & dependencies are built and cached separately. This requires intimate knowledge of the build system. Nix builds already strive to achieve that most of the time.
In that sense, Nix is a language-neutral generalization of JavaScript’s nx with a bad UX.
Within each build: calculate and transport only the delta.

In my opinion, the first point accounts for about 80% of the benefit, the second for probably the remaining 20%.

A naive adoption of the Nix cache would probably be doable with a compatible hash calculation and by using the reproducible NAR format as opposed to the non-reproducible TAR format.

nix-community/go-nix has a NAR implementation.

That would go a long way for interoperability and probably give you a solid cache implementation (in the domain of binary artifacts) for free.
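
To show why the archive format matters for cache keys, here is a small Python sketch. It hashes a plain tar archive of the same files with two different timestamps and compares that with a digest computed over sorted names and contents only. The second function is a simplified stand-in written for this example; it is not the NAR format and produces no Nix-compatible hashes.

```python
import hashlib
import io
import tarfile


def tar_digest(files, mtime):
    """SHA-256 of an uncompressed tar archive; header metadata is included."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime  # the timestamp ends up in the archive bytes
            tar.addfile(info, io.BytesIO(data))
    return hashlib.sha256(buf.getvalue()).hexdigest()


def content_digest(files):
    """SHA-256 over sorted names and contents only (no timestamps)."""
    h = hashlib.sha256()
    for name in sorted(files):
        for part in (name.encode(), files[name]):
            # Length-prefix each part so that boundaries are unambiguous.
            h.update(len(part).to_bytes(8, "little"))
            h.update(part)
    return h.hexdigest()


files = {"bin/app": b"binary", "README": b"docs"}
reordered = {"README": b"docs", "bin/app": b"binary"}

print(tar_digest(files, mtime=0) == tar_digest(files, mtime=1))
print(content_digest(files) == content_digest(reordered))
```

The tar digests differ although the file contents are identical, so they are unsuitable as cache keys unless the metadata is normalized; a digest over a canonical serialization stays stable, which is the property a reproducible format provides.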

It wouldn’t solve the first point above, though. That can only be solved in a language-neutral way by jumping the hurdle and nixifying the build (or by integrating with every possible language-specific tooling, which is an operator’s / integrator’s nightmare).

Are you challenging a marketing hook statement for its degree of correctness and oversimplification?!

EDIT: This might be especially interesting and give bob some interoperability based on ~~docker~~ OCI runtime spec (even better), while still satisfying Nix’ fundamental derivation build interface.

adisbladis · August 30, 2022, 1:17pm

> EDIT: This might be especially interesting and give bob some interoperability based on docker, while still satisfying Nix’ fundamental derivation build interface

This is not based on Docker but on the OCI runtime spec, it’s one level below what users would typically interact with in the container stack.

Solene · August 30, 2022, 2:21pm

I’m too naive