Do you think ed2k would actually scale to nixpkgs size? I mean, I
currently have ~4M files in my store, not even counting that they should
be further split into chunks to increase parallelism, and I did a
garbage-collect relatively recently… I guess we’d hit a billion files
if we included even just a year’s worth of nixpkgs bumps, and with
many people on the network we’d likely get at least that.
I think we should index store paths. I have a mere 1.86M files right
now, but this corresponds to just 27k store paths. This sounds fine for
ed2k search performance if my memory serves me well (in the model of
every person sharing some amount of random stuff collected over years).
We could also compare large Bittorrent trackers with Hydra from the POV
of daily entry flow…
Not the same one. This protocol, by the way, has an obvious MITM attack
which is cheap. That is why I said that secure two-party random string
generation is needed: both client and server need to be able to ensure
that the nonce is good.
Well… I assumed that the “A secure connection is established between
the user and the peer” was a standard TLS-or-similar protocol, that
already checks that both client and server agree on the shared session
key, which is just as good as agreeing on the nonce.
However, the MitM you describe is indeed an issue, and it isn’t
prevented by making sure client and server agree on the nonce, as the
attacker could just relay the nonce-agreement messages.
A solution to this may be to use the DH-derived encryption key to derive
the nonce, because then the attacker can’t be MitM’ing (if they were,
the DH-derived key wouldn’t be the same on both sides).
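To make the idea concrete, here is a minimal sketch in Python of deriving the nonce from a DH shared secret. Everything in it is an assumption for illustration: the toy group parameters P and G (far too small for real use), the function names, and the "nonce" label. A real setup would take the key from whatever the TLS-or-similar layer already provides.

```python
import hashlib
import secrets

# Toy parameters, assumed for illustration only: a 127-bit prime is far
# too small for real use, and G is just a convenient base.
P = 2**127 - 1
G = 3


def keypair():
    """Return (private exponent, public value) for one party."""
    priv = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    return priv, pow(G, priv, P)


def shared_key(priv, peer_pub):
    """Hash the DH shared secret into a 32-byte key."""
    secret = pow(peer_pub, priv, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()


def derive_nonce(key):
    """Derive the nonce from the key under a fixed (hypothetical) label."""
    return hashlib.sha256(b"nonce" + key).digest()


# Honest run: both sides exchange public values and derive the nonce.
client_priv, client_pub = keypair()
server_priv, server_pub = keypair()
client_nonce = derive_nonce(shared_key(client_priv, server_pub))
server_nonce = derive_nonce(shared_key(server_priv, client_pub))
```

In the honest run both sides end up with the same nonce without ever sending it. Someone in the middle who substitutes their own public values ends up sharing one key with the client and a different one with the server, so the two derived nonces no longer match.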
DH is expensive, I would just do a hash-based commitment to personal
nonces, then reveal the committed values, then xor.
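A rough sketch of the commit, reveal, xor sequence in Python, under my own assumptions: SHA-256 as the commitment hash, 32-byte nonces, and a random salt per commitment. All the function names are made up for the example, and it only shows the message flow, not a reviewed protocol.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 32  # assumed size; both parties must use the same length


def commit(nonce):
    """Return (commitment, salt): a salted hash that hides the nonce."""
    salt = secrets.token_bytes(32)
    return hashlib.sha256(salt + nonce).digest(), salt


def verify(commitment, salt, nonce):
    """Check that a revealed (salt, nonce) pair matches the commitment."""
    expected = hashlib.sha256(salt + nonce).digest()
    return hmac.compare_digest(commitment, expected)


def combine(a, b):
    """XOR the two personal nonces into the shared one."""
    return bytes(x ^ y for x, y in zip(a, b))


def run_exchange():
    """Simulate both parties in one process."""
    client_nonce = secrets.token_bytes(NONCE_LEN)
    server_nonce = secrets.token_bytes(NONCE_LEN)

    # Step 1: each side sends only its commitment.
    client_commit, client_salt = commit(client_nonce)
    server_commit, server_salt = commit(server_nonce)

    # Step 2: after receiving the other commitment, each side reveals
    # its salt and nonce, and checks the peer's reveal.
    if not verify(client_commit, client_salt, client_nonce):
        raise ValueError("client reveal does not match commitment")
    if not verify(server_commit, server_salt, server_nonce):
        raise ValueError("server reveal does not match commitment")

    # Step 3: both sides compute the same combined nonce.
    return combine(client_nonce, server_nonce)
```

Because each side is bound to its nonce before seeing the other one, neither can pick its value to steer the xor, and the cost is a couple of hashes per party.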
Also, maybe something from the zero-knowledge proof domain could help
here? I’m not familiar with it, though.
Well, there is a secure multi-party protocol that reveals only the
desired outputs. But we have a lot of paths to request, and a lot of
served paths, so it will be capital-E Expensive.
I am not sure how to do such an oblivious search in a reasonably secure
and efficient way. Although I guess I could ask a few people to see if
the current state of the art is indeed better nowadays (but I have
doubts it scales well).
Well… if they don’t answer I can ask people from my side, just tell me
if that’s required.
Well, I guess I was too pessimistic: just by looking at the IACR preprint
server one can find out that so-called «Private set intersection» is
enough, at least in the case of one server and one client…
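For a feel of what private set intersection looks like with one client and one server, here is a toy sketch in Python of the classic commutative-exponentiation flavour. It is not taken from any particular preprint, and everything in it is my own assumption: the tiny modulus P, the way items are hashed into the group, and the function names. It only illustrates the message flow and should not be read as a secure construction.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # toy prime modulus, assumed for illustration only


def hash_to_group(item):
    """Map a string to an integer in [1, P - 1]."""
    h = int.from_bytes(hashlib.sha256(item.encode()).digest(), "big")
    return h % (P - 1) + 1


def random_exponent():
    """Pick a secret exponent coprime to P - 1, so blinding is a bijection."""
    while True:
        e = secrets.randbelow(P - 3) + 2
        if math.gcd(e, P - 1) == 1:
            return e


def blind(values, exponent):
    """Raise every value to the secret exponent modulo P."""
    return [pow(v, exponent, P) for v in values]


def intersect(client_items, server_items):
    """Simulate both parties; return the client items the server also has."""
    a = random_exponent()  # client secret
    b = random_exponent()  # server secret

    # Client -> server: H(x)^a for each wanted item.
    client_blinded = blind([hash_to_group(x) for x in client_items], a)
    # Server -> client: H(y)^b for each served item.
    server_blinded = blind([hash_to_group(y) for y in server_items], b)
    # Server -> client: the client's values raised to b, in the same order.
    client_double = blind(client_blinded, b)
    # Client side: raise the server's values to a and compare.
    server_double = set(blind(server_blinded, a))

    return [x for x, d in zip(client_items, client_double)
            if d in server_double]
```

Since (H(x)^a)^b equals (H(x)^b)^a, matching items collide after double blinding while neither side sees the other's raw list. The cost is one modular exponentiation per item per side, which is where the scaling question for a large number of paths comes back in.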