logoalt Hacker News

bityardyesterday at 5:19 PM1 replyview on HN

Interesting! I like the relative simplicity and durability guarantees. I can see using this for dev and proof of concept. Or in situations where HA/RAID are handled lower in the stack.

What is the performance like for reads, writes, and deletes?

And just to play devil's advocate: What would you say to someone who argues that you've essentially reimplemented a filesystem?


Replies

uroniyesterday at 5:53 PM

It uses LMDB, so if the object mapping fits in memory that should be pretty optimal for reading, while using the build-in Linux page cache and not a separate one (important for testing use cases). For write/deletes it has a bit of write-amplification due to the copy-on-write btree. I've implemented a separate, optional WAL for this and also a mode where writes/delete can be bundeled in a transaction, but in practice I think the performance difference should not matter.

W.r.t. filesystem: Yes, aware of this. Initially used minio and also implemented the use case directly on XFS as well and only had problems at larger scales (that still fit on a machine) with it. Ceph went into a similar direction with BlueStore (reimplementing the filesystem, but with RocksDB).