I have been learning Nix for a month or so and really enjoy it.
I am considering migrating my laptops and servers to NixOS, but there is one potential blocker:
I need to be able to use, and sometimes change, some configuration during potentially long offline periods (days, or even weeks).
We are talking about providing basic services such as NFS, NextCloud, Home Assistant, Prometheus, Syncthing, Gitea, ArchiveBox, etc.
From the information I have gathered, these are the items I need in order to anticipate those offline periods:
Keep offline copies of key documentation (manual, NixOS wiki)
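For the manual, I believe something like this in configuration.nix keeps it readable offline via the nixos-help command (these are mostly defaults, just written out explicitly; I have not battle-tested it yet):

```nix
# Keep documentation in the system closure so it stays readable offline.
# These are mostly the NixOS defaults, made explicit.
{ ... }:
{
  documentation.enable = true;        # man pages and friends
  documentation.man.enable = true;
  documentation.nixos.enable = true;  # local NixOS manual, opened with `nixos-help`
}
```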
To be clear, you’re not talking about an air-gapped setup where there is technically still a way to get data acquired from the internet onto the devices, right?
I ask because the use cases you list seem to inherently require at least some networking, and seem too general-purpose to need anything bespoke, especially offline. This may well be an XY problem.
The caching flakes do is evaluation caching; it has nothing to do with being online, only with how quickly the actual Nix code evaluates.
Channels also don’t randomly update without explicit flags, so as far as nix store contents are concerned, you should not need to worry about this.
The flakes concept does have other pretty significant benefits, though. Flakes also enforce pure evaluation; without it (i.e. if you don’t handle purity correctly), your sources may in fact try to update every 2 hours, so I guess they do help a little there.
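For illustration, a minimal flake along these lines (the host name and nixpkgs branch are placeholders) pins its inputs in flake.lock, and they only move when you explicitly run `nix flake update`:

```nix
# Illustrative flake.nix; "myhost" and the nixpkgs branch are placeholders.
# Inputs are recorded in flake.lock and only change on `nix flake update`.
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }: {
    nixosConfigurations.myhost = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [ ./configuration.nix ];
    };
  };
}
```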
To be clear, I’ve not done this much myself; I just wanted to get the questions above out of the way first.
My two cents are that your list seems pretty reasonable, and:
Whether it is practical will depend a bit on your exact use case. While the servers are offline there is probably not going to be much need for ongoing maintenance, since being offline shields you from most potential needs for updates and whatnot anyway. Any maintenance that does come up will involve existing software, so there should be few cache misses.
That said, Nix/OS is definitely not designed for this use case. There’s no way to ensure that you have everything you might need related to a certain module, since modules may involve scripts that call out to any random binary. You might flip on some option only to discover that it uses jq for something, and then not be able to use that option until you can reach the upstream cache again, for example.
In other words, it’ll depend on the kinds of software you deploy and what maintenance actually happens during offline periods. It’s hard to infer how your use case will go from others’, so I’d say give it a pilot run and see what works and what doesn’t.
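One partial hedge against that particular failure mode - and this is just a guess at what your modules might shell out to, not a guarantee - is to pre-seed the store with likely candidates via system.extraDependencies:

```nix
# Assumption: a guess at tools that modules commonly shell out to,
# kept in the system closure so they are already in the store when offline.
{ pkgs, ... }:
{
  system.extraDependencies = with pkgs; [ jq curl rsync ];
}
```

It is still guesswork, of course; you cannot enumerate everything a module might eventually call.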
For the record, people are running NixOS in space, so it’s not like this is unprecedented. Their solution ends up being quite a bit more bespoke, but the tools to achieve things like this do exist in the ecosystem.
As TLATER mentioned, this is eval caching, not derivation caching, and it is meant to counteract how noticeably slow evaluation is with flakes. In practice, flakes are often slower than non-flakes, especially since nix 2.19+ (and performance is not the main reason to use flakes anyway). The enforced eval-time purity can increase the rate of cache hits (i.e. it prevents you from being surprised by the inputs), which is probably what you’d benefit from - just make sure never to use --impure.
Besides that, I don’t think I quite understand your use case:
Do you plan to simply tweak some existing config files, or do you plan to add new services/programs/packages?
A central binary cache implies there is still some networking between the various machines and said cache; is there going to be networking available?
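If it is, a local binary cache is fairly easy to sketch. On one always-on machine (the hostname “cachebox” and the key path are made up; the key pair comes from nix-store --generate-binary-cache-key):

```nix
# Serve this machine's nix store to the LAN as a binary cache.
{ ... }:
{
  services.nix-serve = {
    enable = true;
    port = 5000;
    secretKeyFile = "/var/keys/cache-priv-key.pem";
  };
  networking.firewall.allowedTCPPorts = [ 5000 ];
}
```

and on the other machines:

```nix
# Point substitution at the LAN cache; the public key string is a placeholder.
# Note: setting these replaces the defaults, so re-add the cache.nixos.org
# substituter and key here if you still want the upstream cache as well.
{ ... }:
{
  nix.settings = {
    substituters = [ "http://cachebox:5000" ];
    trusted-public-keys = [ "cachebox:PASTE-PUBLIC-KEY-HERE" ];
  };
}
```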
Yes, I am just talking about a scenario where the Internet is suddenly gone (so unplanned) and I need to make basic changes to my system and service configuration. Most likely changes to networking, something like the sketch below.
And yes, the local network will still be available…
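For example, switching an interface to a static address (the interface name and addresses are made up):

```nix
# Hypothetical offline tweak: pin a static address instead of DHCP.
# In theory only generated config files change here.
{ ... }:
{
  networking.useDHCP = false;
  networking.interfaces.enp3s0.ipv4.addresses = [
    { address = "192.168.1.10"; prefixLength = 24; }
  ];
  networking.defaultGateway = "192.168.1.1";
  networking.nameservers = [ "192.168.1.1" ];
}
```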
Yes, this is what I noticed while doing basic testing: it is actually hard to predict what would suddenly be needed, even though I am not trying to explicitly add new software.
I feel Nix is simply not designed for this use case.
Now I was just thinking I might be overthinking it:
Suppose I am offline and in a “break the glass in case of emergency” situation. Isn’t the most straightforward approach to find a way to make the nix store writable, or to change the symlink, and directly change the configuration there?
If something isn’t possible offline, that means you are missing some software, and fundamentally need to download it. Manually editing the nix store would not help with this, and is an excellent way to break your systems completely.
In an emergency in which changing files without modules somehow helps, you should instead use your config to write files to the nix store with pkgs.writeText & co., and disable modules and their config files, replacing them with your hand-written ones. I think this will very rarely actually help with problems caused by being offline though.
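A minimal sketch of that pattern (the file name and contents are entirely made up, and lib.mkForce is only needed if a module already manages the file):

```nix
# Illustrative only: ship a hand-written config file through the store
# instead of editing anything in /nix/store or /etc by hand.
{ lib, pkgs, ... }:
{
  environment.etc."example-service.conf".source = lib.mkForce (
    pkgs.writeText "example-service.conf" ''
      # hand-written emergency configuration
      listen_address = 127.0.0.1
    ''
  );
}
```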
If your concern is primarily just unscheduled, long ISP outages for your homelab, you’ll probably find that after running your systems for a few months you will almost never need to make changes - at least while offline - for the use cases you list, so this problem will probably not be very significant after a while.