
Show HN: DuckDB for Kafka Stream Processing

64 points by dm03514 yesterday at 5:25 PM | 13 comments

Hello everyone! We built SQLFlow, a lightweight stream processing engine.

We use DuckDB as the underlying SQL engine, which lets SQLFlow process tens of thousands of messages per second using ~250 MiB of memory!

DuckDB also supports a rich ecosystem of sinks and connectors!

https://sql-flow.com/docs/category/tutorials/

https://github.com/turbolytics/sql-flow
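To make the idea concrete, here is a minimal sketch of the pattern described above: consume a micro-batch from Kafka, run a DuckDB SQL statement over it, and publish the results. This is illustrative only, not SQLFlow's actual API or configuration; the topics, consumer settings, message schema, and SQL are assumptions.

```python
# Sketch of "DuckDB as a Kafka stream processor": micro-batch consume -> SQL -> produce.
# Not SQLFlow's API; topic names, group id, message schema, and the query are assumptions.
import json

import duckdb
import pyarrow as pa
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "duckdb-stream-sketch",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["input-events"])
producer = Producer({"bootstrap.servers": "localhost:9092"})

con = duckdb.connect()  # in-memory DuckDB; all SQL runs here
BATCH_SIZE = 1000

while True:
    # Pull up to BATCH_SIZE messages as one micro-batch.
    msgs = consumer.consume(num_messages=BATCH_SIZE, timeout=1.0)
    rows = [json.loads(m.value()) for m in msgs if m.error() is None]
    if not rows:
        continue

    # Expose the micro-batch to DuckDB as an Arrow table and aggregate it with plain SQL.
    batch = pa.Table.from_pylist(rows)
    result = con.execute(
        "SELECT city, count(*) AS events FROM batch GROUP BY city"
    ).arrow()

    # Publish the aggregated rows downstream, then commit the consumed offsets.
    for row in result.to_pylist():
        producer.produce("output-aggregates", json.dumps(row).encode("utf-8"))
    producer.flush()
    consumer.commit()
```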

We were tired of running JVMs for simple stream processing, and of writing bespoke one-off stream processors.

I would love your feedback, criticisms and/or experiences!

Thank you


Comments

pulkitsh1234 yesterday at 8:01 PM

(Not an expert in stream processing.) From the docs here, https://sql-flow.com/docs/introduction/basics#output-sink, it seems like this works on "batches" of data. How is this different from batch processing? Where is the "stream" here?

srameshc yesterday at 6:24 PM

This looks brilliant, thank you. I love DuckDB and use it for a lot of local data processing jobs. We have a data stream, though not at a size where we need to push it to BigQuery or anything similar. I was thinking of building something like sql-flow myself, so I'm glad this now makes the job very easy.

mihevc yesterday at 6:38 PM

How does this compare to https://github.com/Query-farm/tributary?

mbay yesterday at 6:32 PM

I see an example with what looks like a lookup-type join against a Postgres DB. Are stream/stream joins supported, though?

The DLQ and Prometheus integration out of the box are nice.
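For reference, the lookup-type join mentioned here maps to something DuckDB can do natively through its postgres extension: attach a Postgres database and join against it at query time. Below is a minimal sketch of that enrichment pattern (it does not address the stream/stream join question); the connection string, the `users` table, and the column names are assumptions.

```python
# Sketch of a lookup-type join: enrich a micro-batch of events by joining
# against a Postgres table via DuckDB's postgres extension.
# Connection string, `users` table, and columns are assumptions for illustration.
import duckdb
import pyarrow as pa

con = duckdb.connect()
con.execute("INSTALL postgres")
con.execute("LOAD postgres")
con.execute(
    "ATTACH 'dbname=app host=localhost user=app' AS pg (TYPE postgres, READ_ONLY)"
)

# A micro-batch of events, as it might arrive from Kafka.
events = pa.Table.from_pylist([
    {"user_id": 1, "action": "click"},
    {"user_id": 2, "action": "purchase"},
])

# Each event row is matched against the Postgres `users` table at query time.
enriched = con.execute("""
    SELECT e.user_id, e.action, u.plan
    FROM events e
    LEFT JOIN pg.public.users u ON u.id = e.user_id
""").arrow()

print(enriched.to_pylist())
```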

itsfseven yesterday at 7:08 PM

It would be great if this supported Pulsar too!