logoalt Hacker News

kerblanglast Monday at 11:13 PM1 replyview on HN

Kafka is really not intended to improve on this. Instead, it's intended for very high-volume ETL processing, where a classical message queue delivering records would spend too much time on locking. Kafka is hot-rodding the message queue design and removing guard rails to get more messages thru faster.

Generally I say, "Message queues are for tasks, Kafka is for data." But in the latter case, if your data volume is not huge, a message queue for async ETL will do just fine and give better guarantees as FIFO goes.

In essence, Kafka is a very specialized version of much more general-purpose message queues, which should be your default starting point. It's similar to replacing a SQL RDBMS with some kind of special NoSQL system - if you need it, okay, but otherwise the general-purpose default is usually the better option.


Replies

CharlieDigitallast Monday at 11:31 PM

Of course this is not the same as Kafka, but the comment I'm replying to:

    > Ah yes, and every consumer should just do this in a while (true) loop as producers write to it. Very efficient and simple with no possibility of lock contention or hot spots. Genius, really.
Seemed to imply that it's not possible to build a high performance pub/sub system using a simple SQL select. I do not think that is true and it is in fact fairly easy to build a high performance pub/sub system with a simple SQL select. Clearly, this design as proposed is not the same as Kafka.
show 1 reply