Declarative insecure self-signed certificates in NixOS

I use a mesh VPN to connect my devices. Each device gets an IP address in the VPN’s IP range, and the mesh VPN cryptographically guarantees that any packet to/from an IP address will only go to/from the respective device. I use my hosts file to associate these IP addresses with human-readable hostnames (laptop, desktop, server, etc.).
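In NixOS terms, that hosts-file mapping can be written declaratively with networking.hosts; a rough sketch, with made-up addresses and names:

networking.hosts = {
  # VPN addresses and hostnames here are illustrative
  "10.144.0.1" = [ "laptop" ];
  "10.144.0.2" = [ "desktop" ];
  "10.144.0.3" = [ "server" ];
};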

This has worked very well and let me avoid dependence on centralized cloud infrastructure. However, I am starting to run into more and more browser annoyances that arise from the lack of TLS support. No current browser supports unencrypted HTTP/2, the HTTP/3 and QUIC specs have no support at all for unencrypted connections, and Firefox shows an annoying “insecure site” warning popup that interferes with my password manager and can no longer be disabled.

I don’t need any of the TLS security guarantees. How are other people solving this problem? Is it possible to install a self-signed certificate that can only be used for a hardcoded set of domains, to trick Firefox into enabling the features it is locking behind TLS?

My security model considers my Nix configuration to be public (and this certificate would need to be installed on less-trusted edge devices), so I want to avoid having this certificate be able to spoof any domain outside of my intranet.

I don’t know if this helps, but instead of self-signed certificates in my homelab I use step-ca. I trust its root CA on all of my machines and limit which domains it can issue certificates for (much safer that way: even if someone steals your CA’s private key, they cannot impersonate google.com), and both Caddy (my reverse proxy) and Teleport request certificates from step-ca automagically using ACME.
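Roughly, the Caddy side of that looks like the sketch below; the hostname, upstream, and step-ca ACME directory URL are placeholders, and the internal root still has to be trusted on each machine:

security.pki.certificateFiles = [ ./step-ca-root.pem ];  # trust the internal root everywhere

services.caddy = {
  enable = true;
  virtualHosts."app.internal.example" = {       # placeholder hostname
    extraConfig = ''
      reverse_proxy 127.0.0.1:8080              # placeholder upstream
      tls {
        # ACME directory of the internal step-ca (provisioner name assumed to be "acme")
        ca https://ca.internal.example:8443/acme/acme/directory
      }
    '';
  };
};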

Thanks for that link! The IP range for my VPN is 10.144.0.0/16, so my plan is to generate a CA that can sign certificates for all IPs in that range (as well as local IPs):

openssl req -x509 -nodes -days 365000 -newkey rsa:4096 -keyout custom.ca.key -out custom.ca.crt \
   -addext "basicConstraints = critical, CA:true, pathlen:0" \
   -addext "subjectKeyIdentifier = hash" \
   -addext "keyUsage = critical, keyCertSign, cRLSign" \
   -addext "nameConstraints = critical, permitted;IP:10.144.0.0/0.0.225.225, permitted;IP:127.0.0.1/0.0.0.225"

Then I will publish both custom.ca.crt and custom.ca.key as part of my Nix config, since I don’t care if an attacker can spoof localhost or any node on my VPN (both of these are already authenticated by other means). custom.ca.crt will be added to the certificate store on all of my devices.

Finally, I will write a derivation for each device that signs a certificate for its hostname using custom.ca.key. The output of this derivation can be fed to nginx.

Are there any obvious security problems with this plan? I know that I am forgoing the security benefits of TLS by publishing the private key for the root CA, but I would be perfectly happy if I am no less secure than I currently am by using http over my secure tunnels.

I wouldn’t do that, but that’s because I already have sops-nix set up, so the private key (or rather the passphrase to decrypt it) goes into sops. But I guess it is required in your case to have the key available in your Nix store, as you want to use it in derivations.

Mesh VPN I understand, but how does localhost get authenticated in your case?

Not a security expert, so I will refrain from making any strong statement one way or another!

I wouldn’t do that, but that’s because I already have sops-nix set up, so the private key (or rather the passphrase to decrypt it) goes into sops. But I guess it is required in your case to have the key available in your Nix store, as you want to use it in derivations.

Yeah, I handle secrets as state. My hope is that by doing it this way I don’t need to create any more secrets I need to handle!

Mesh VPN I understand, but how does localhost get authenticated in your case?

The same way it always does—127.0.0.x goes to my loopback device which only local processes can listen on. I’m fine with local processes being able to MITM local services (which is why I use http://localhost, like most people).

Not a security expert, so I will refrain from making any strong statement one way or another!

Me neither. I am hoping one will see this and point out any problems if they are obvious!

Okay, an update. I mostly got this working, using .vpn as my internal TLD. I generated a root certificate to put in my Nix configuration with

openssl req -x509 -days 365000 -newkey rsa:4096 -noenc -keyout waltmck-cakey.pem -out waltmck-cacrt.pem \
   -addext "basicConstraints = critical, CA:true, pathlen:0" \
   -addext "subjectKeyIdentifier = hash" \
   -addext "keyUsage = critical, keyCertSign, cRLSign" \
   -addext "nameConstraints = critical, permitted;DNS:.vpn, permitted;IP:10.144.0.0/0.0.225.225, permitted;IP:127.0.0.1/0.0.0.225"

and trusted it with security.pki.certificateFiles = [./waltmck-cacrt.pem];.

Then, I wrote a derivation for each of my hosts that basically just runs

openssl req -x509 -subj "/CN=${hostname}.vpn" -days 365000 -noenc \
    -CA ${./waltmck-cacrt.pem} -CAkey ${./waltmck-cakey.pem} -extensions usr_cert \
    -out $out/server.crt -keyout $out/server.key -newkey rsa:4096 \
    -addext "subjectAltName = DNS:${hostname}.vpn"

and used the result to set sslCertificate and sslCertificateKey in my nginx config.
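For completeness, the wrapper around that command and the nginx wiring look roughly like the sketch below. It assumes openssl 3 (for req -x509 with -CA/-CAkey), that hostname is passed in as a plain string, and that the signed key ending up world-readable in the store is acceptable, which it is here only because the CA key is public anyway:

{ pkgs, hostname, ... }:
let
  hostCert = pkgs.runCommand "${hostname}-vpn-cert" { nativeBuildInputs = [ pkgs.openssl ]; } ''
    mkdir -p $out
    openssl req -x509 -subj "/CN=${hostname}.vpn" -days 365000 -noenc \
      -CA ${./waltmck-cacrt.pem} -CAkey ${./waltmck-cakey.pem} -extensions usr_cert \
      -out $out/server.crt -keyout $out/server.key -newkey rsa:4096 \
      -addext "subjectAltName = DNS:${hostname}.vpn"
  '';
in {
  services.nginx.virtualHosts."${hostname}.vpn" = {
    onlySSL = true;
    sslCertificate = "${hostCert}/server.crt";
    sslCertificateKey = "${hostCert}/server.key";
  };
}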

This setup mostly works, but I would like to be able to connect to https://hostname rather than https://hostname.vpn. Unfortunately, using aliases in /etc/hosts or setting services.resolved.domains = ["vpn"]; yields SSL_ERROR_BAD_CERT_DOMAIN (and my root certificate correctly does not have the ability to sign certificates for domains that don’t end in .vpn). Is there a way to set the equivalent of a CNAME locally with systemd-resolved?
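The two attempts correspond roughly to the following (the address is illustrative):

networking.hosts."10.144.0.5" = [ "server" "server.vpn" ];   # bare alias in /etc/hosts
services.resolved.domains = [ "vpn" ];                        # search domain, so plain "server" resolves

# Either way the browser connects to https://server, the certificate only claims
# server.vpn, and Firefox reports SSL_ERROR_BAD_CERT_DOMAIN.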

This is a bad idea. The key will end up world-readable in the nix store. If you build off-host nix will even send it across the internet.

You should use something along the lines of sops-nix, agenix, vars or even just imperative systemd credentials to expose these paths to your NixOS configuration without putting the files in the store.

As you say:

NixOS should never be involved in managing state, though NixOS can set up services that manage state. The above projects are all services designed to handle secrets as state.
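As a sketch of what that flow can look like with sops-nix (the secret name, file, and owner here are assumptions; the decrypted key never enters the store):

sops.defaultSopsFile = ./secrets.yaml;
sops.secrets."server-key" = { owner = "nginx"; };   # decrypted to /run/secrets/server-key at activation

services.nginx.virtualHosts."${hostname}.vpn" = {
  sslCertificate = ./server.crt;                               # public half can live in the store
  sslCertificateKey = config.sops.secrets."server-key".path;   # private half stays out of the store
};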

This is a bad idea. The key will end up world-readable in the nix store. If you build off-host nix will even send it across the internet.

The whole idea of using nameConstraints in the way that I did is that the root key is not a secret. Under my security model it is already impossible to impersonate IPs in my VPN’s address range, since authentication is done by my VPN. I understand that leaking the root key would allow an attacker to impersonate *.vpn or localhost, but I was already fine just not using TLS for those anyways: this is no less safe than that alternative, and avoids introducing additional complexity/state management for no security gain.
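One way to sanity-check that claim is a throwaway derivation that signs a leaf for an out-of-scope name and expects verification to fail; this relies on OpenSSL enforcing name constraints found in a trust anchor, which recent versions do:

pkgs.runCommand "check-name-constraints" { nativeBuildInputs = [ pkgs.openssl ]; } ''
  # Sign a leaf certificate for a name outside the permitted subtrees.
  openssl req -x509 -subj "/CN=google.com" -days 1 -noenc -newkey rsa:2048 \
    -CA ${./waltmck-cacrt.pem} -CAkey ${./waltmck-cakey.pem} \
    -addext "subjectAltName = DNS:google.com" \
    -out bad.crt -keyout bad.key
  # Verification against the constrained root should fail with a
  # "permitted subtree violation"; if it succeeds, something is wrong.
  if openssl verify -CAfile ${./waltmck-cacrt.pem} bad.crt; then
    echo "name constraints were not enforced" >&2
    exit 1
  fi
  touch $out
''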

Ah. Right. Fair enough. I can’t help but think that there will be issues on some level here anyway, given that you’re subverting the basic principles of web security. https is all but universally required, and hence relied on; it’s possible these certificates end up being used in ways you don’t expect beyond simple transport encryption.

It’s your threat model, but security is one of those things where going against the flow is generally a bad idea IME. I’d really recommend not just throwing caution to the wind; especially because sops & co. really are not that much extra overhead.

Worst case you could set up a DNS server to do that, but it doesn’t matter: you’d need a certificate for the domain you’re redirecting from anyway, even with a CNAME record. There’s no way to do this without claiming those hostnames in your certificate.

That said, surely you could just do that? Do browsers reject certificates where the CA doesn’t limit itself to a TLD these days? AIUI that is an extension, and it should be perfectly possible to just claim arbitrary names and have them signed by your CA; as long as you trust that cert (and you don’t explicitly limit your CA like you do now), browsers should have no issue with this.

I.e., drop the permitted;DNS:.vpn in your CA, replace it with specific permitted claims for all your hostnames, and add subjectAltName = DNS:${hostname} to your certs (aside: are spaces allowed in there?). Maybe claim the CN for that hostname and drop the .vpn altogether.

As an aside, if you can accept a shared TLD of some sort, you could set up mDNS and use .local. This would save you the maintenance effort of hard-coded hostnames, and seems more appropriate for this kind of mesh VPN than raw hostnames IMO; I don’t think saving 6 characters is worth all the subverting of networking standards you’re doing.

Of course, you’d still need to sign any new hostnames, so you’d need a way to distribute your CA cert, and that in turn probably is best done with your NixOS config, so maybe mDNS doesn’t actually do all that much for you. Still, food for thought.
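For the record, the NixOS side of the mDNS route is small; a rough sketch using the avahi module (nssmdns4 on current releases):

services.avahi = {
  enable = true;
  nssmdns4 = true;       # resolve <hostname>.local via nss-mdns
  publish = {
    enable = true;
    addresses = true;    # announce this host's addresses as <hostname>.local
  };
};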

I can’t help but think that there will be issues on some level here anyway, given that you’re subverting the basic principles of web security. https is all but universally required, and hence relied on; it’s possible these certificates end up being used in ways you don’t expect beyond simple transport encryption.

Finding good documentation on X.509 has been very difficult, but it seems that there is no good way to get the performance benefits of HTTP/2 and HTTP/3 over a secure internal connection without making some serious compromises. Either you take on a totally unnecessary maintenance burden by paying for a domain name and using Let’s Encrypt, or you install certificates on your devices and take responsibility for protecting a secret (I suspect this isn’t an accident, since the designers of X.509 had every incentive to further contribute to the centralization of the web). At least by using nameConstraints this way I avoid opening an additional security hole where someone who steals my secret key can impersonate google.com on any of my devices; judging by my research online, that is more than most people do.

If what I am describing does work, I think that it is the cleanest solution to the problem that avoids taking responsibility for any secrets. That being said, I’d really appreciate it if anyone finds any vulnerabilities or issues!

It’s your threat model, but security is one of those things where going against the flow is generally a bad idea IME. I’d really recommend not just throwing caution to the wind; especially because sops & co. really are not that much extra overhead.

I’m quite cautious and I have given serious consideration to all of the different options here, but I don’t think that sops would quite solve my problem. I have devices in my network which I do not entirely trust (e.g. VPS servers with public-facing services and IoT devices), and if one of these were compromised then waltmck-cakey.pem would be readable. If I were to actually treat it as a secret, I would need to store it as state on some trusted device I control and manually sign/deploy leaf certs to new devices as I bring them online. This would be more maintenance burden than is justified by the benefits of using TLS.

That said, surely you could just do that? Do browsers reject certificates where the CA doesn’t limit itself to a TLD these days? AIUI that is an extension, and it should be perfectly possible to just claim arbitrary names and have them signed by your CA; as long as you trust that cert (and you don’t explicitly limit your CA like you do now), browsers should have no issue with this.

I could create and trust a CA that doesn’t limit itself to a TLD. However, that is a massive security hole: anyone who steals it could spoof google.com to any of my devices. Even if I restricted it to my VPN’s IP range, it would still be a bad idea: anyone who compromised one of my devices (and thereby obtained the root key) could point google.com at that device and MITM from there.

drop the permitted;DNS:.vpn in your CA, replace it with specific permitted claims for all your hostnames, and add subjectAltName = DNS:${hostname} to your certs (aside: are spaces allowed in there?). Maybe claim the CN for that hostname and drop the .vpn altogether.

This is the method that I originally considered. The reason I gave up on it is that every time I add a new device to my network, I would have to generate a new root cert and re-sign all of my devices’ leaf certs. This would actually still be acceptable if it could be done programmatically (i.e. I write a derivation for a root cert given a list of hostnames), but that would require the root certificate generation to be reproducible (so that the same root cert is trusted by all of my devices). I explored that pathway a little bit, but it ended up being too cursed even for me.

As an aside, if you can accept a shared TLD of some sort, you could set up mDNS and use .local. This would save you the maintenance effort of hard-coded hostnames, and seems more appropriate for this kind of mesh VPN than raw hostnames IMO; I don’t think saving 6 characters is worth all the subverting of networking standards you’re doing.

Yeah, mDNS is great, and that is what I did before switching to NixOS. But if I already know the IP ↔ hostname mapping ahead of time, why introduce an additional point of failure?

That’s the point; sops uses device-specific encryption, so it would only be readable on devices that should be able to read it (or none, I suppose, if you really wanted to store a dev-only secret).

If you manage your secrets as data and not configuration, this is quite a bit less problematic. I would really suggest looking into a flow using sops and a simple script you call any time you want a new host, rather than using derivations for things that should never be derivations.
