Dreaming about distributed file system

SeaweedFS ?


If you want something that’s a million times nicer than Ceph and also supports geo-replication out of the box (whilst only supporting S3) - check out Garage: https://garagehq.deuxfleurs.fr/

I had heard of Garage, but thanks for the reminder. I think what they are doing is super cool, but it is absolutely not a Ceph replacement for anything other than basic object storage (i.e. a weakly consistent object store with no read-your-own-writes guarantee), so I think calling it nicer than Ceph with no qualifiers is a bit unfair :) . For instance, with decent networking it is completely feasible for Ceph to provide block storage devices and then run a database with separated compute/storage on top. This is a non-goal for Garage, which is fine; they are trying to build a different system.


Actually, it does implement read-after-write consistency: Configuration file format | Garage HQ
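
(If you're wondering why quorum replication gives you that: here's a toy Python sketch of the general idea, not Garage's actual code, just the usual N=3 / W=2 / R=2 arithmetic where any read quorum has to overlap any write quorum.)

```python
# Toy quorum store: N replicas, writes acknowledged by W nodes, reads ask R nodes.
# When R + W > N, every read quorum overlaps every write quorum, so a read that
# follows an acknowledged write always sees at least one fresh copy.
# Generic sketch of the idea, not Garage's implementation.
import random

N, W, R = 3, 2, 2  # assumed 3-way replication with quorum reads and writes

class QuorumStore:
    def __init__(self):
        self.replicas = [{} for _ in range(N)]  # one key/value dict per replica

    def put(self, key, value, version):
        # Write to W randomly chosen replicas (the "write quorum").
        for i in random.sample(range(N), W):
            self.replicas[i][key] = (version, value)

    def get(self, key):
        # Read from R randomly chosen replicas and keep the newest version seen.
        answers = [self.replicas[i].get(key) for i in random.sample(range(N), R)]
        answers = [a for a in answers if a is not None]
        return max(answers, default=None)  # highest (version, value) wins

store = QuorumStore()
store.put("greeting", "hello", version=1)
assert store.get("greeting") == (1, "hello")  # read-after-write holds because R + W > N
```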

For me, it’s nicer than Ceph in that you can get it going in a matter of seconds on basic hardware, as opposed to requiring 10GbE networking.

It seems to me like a good SeaweedFS deployment needs at least three master servers with low latency between them and in different failure domains to keep its promises (because the master holds the volume mappings, replicated with Raft). That is before you even add any volume servers (although maybe you just add some big disks to your masters and colocate the services).
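
(The "at least three masters" figure is just Raft's majority rule; here's a quick back-of-the-envelope sketch in Python, generic Raft arithmetic rather than anything SeaweedFS-specific.)

```python
# Raft needs a majority of masters to elect a leader and commit volume-mapping
# updates, so a cluster of n masters tolerates floor((n - 1) / 2) failures.
# Generic Raft arithmetic, nothing SeaweedFS-specific.
def raft_fault_tolerance(n_masters: int) -> int:
    quorum = n_masters // 2 + 1          # smallest majority
    return n_masters - quorum            # how many masters can be lost

for n in (1, 2, 3, 5):
    print(f"{n} master(s): quorum={n // 2 + 1}, tolerates {raft_fault_tolerance(n)} failure(s)")
# 1 -> tolerates 0, 2 -> tolerates 0, 3 -> tolerates 1, 5 -> tolerates 2,
# which is why 3 masters in separate failure domains is the practical minimum.
```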

Still seems like a bunch of hassle vs. some weakly consistent p2p sync system. I guess if you enjoy this kind of thing, go for it? If you want to know why these distributed file systems all seem to have such baroque requirements, and you have a bunch of free time and enjoy CS papers, give this a read. It doesn’t require advanced mathematics, just a great deal of patience to work through the various voting scenarios.

You could always run the SeaweedFS master on a single machine and back it up regularly? If the volume mappings are “stable enough” this could be workable, but bear in mind that any new volumes created between backups would need to be manually recovered (assuming SeaweedFS has the tools for this; I haven’t read enough to answer that).

Shame on me for skimming! That’s super cool. But Ceph doesn’t require 10GbE for object replication; you only need the really nice networking gear if you want to provide a filesystem or block device. That said: configuring and running Ceph is horrible, and I wouldn’t wish it on anyone :)

This is mentioned under non-goals on the goals page.

POSIX/Filesystem compatibility: we do not aim at being POSIX compatible or to emulate any kind of filesystem. Indeed, in a distributed environment, such synchronizations are translated in network messages that impose severe constraints on the deployment.

Does that mean I can’t use it as a filesystem?

Correct. Garage is an object store, not an FS. You can use it as a backup target, or for systems that target object stores.
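
For the “systems that target object stores” case, anything that speaks S3 should work against Garage’s S3 API. A minimal sketch with boto3, where the endpoint, region, credentials and bucket below are placeholders rather than real values:

```python
# Minimal example of talking to Garage through its S3-compatible API with boto3.
# The endpoint, region, credentials and bucket name are placeholders; substitute
# whatever your Garage deployment and key/bucket setup actually use.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://garage.example.internal:3900",  # assumed Garage S3 endpoint
    region_name="garage",                                # assumed region label
    aws_access_key_id="GK...",                           # access key created in Garage
    aws_secret_access_key="...",
)

s3.put_object(Bucket="backups", Key="hosts/alpha/etc.tar.zst", Body=b"...")
print(s3.get_object(Bucket="backups", Key="hosts/alpha/etc.tar.zst")["Body"].read())
```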


The Quobyte site has a lot of good explanations.