Hacker News

hantusk, last Thursday at 7:12 AM

I agree. So many disparate solutions. The streaming SQL primitives are good enough by themselves (e.g. `tumble`, `hop` or `session` windows), but the infrastructural components are always rough in real-life use cases.
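For anyone who hasn't used these windows, here is a rough sketch of what the three primitives compute, in plain Python rather than any engine's SQL dialect. The function names mirror the window names above, but the signatures, the integer timestamps, and the "gap is inclusive" rule for sessions are my own assumptions, not any particular product's semantics.

```python
from typing import Iterable, List, Tuple

Window = Tuple[int, int]  # half-open interval [start, end)


def tumble(ts: int, size: int) -> Window:
    """Return the one fixed-size, non-overlapping window that contains ts."""
    start = ts - ts % size
    return (start, start + size)


def hop(ts: int, size: int, slide: int) -> List[Window]:
    """Return every window of length `size`, advancing by `slide`, that contains ts."""
    windows = []
    start = ts - ts % slide  # latest window start at or before ts
    while start > ts - size:  # window still covers ts
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


def session(timestamps: Iterable[int], gap: int) -> List[Tuple[int, int]]:
    """Group event times into sessions, returned as (first, last) event time.

    Assumption: a new session starts when the distance to the previous
    event is strictly greater than `gap`.
    """
    sessions: List[Tuple[int, int]] = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][1] <= gap:
            sessions[-1] = (sessions[-1][0], ts)  # extend the open session
        else:
            sessions.append((ts, ts))  # start a new session
    return sessions
```

So an event at second 65 lands in one 60-second tumbling window, but in two hopping windows when the slide is 30 seconds. The hard part in practice is not this arithmetic but everything around it: late events, state, and recovery.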

Crossing fingers for solutions like `https://github.com/feldera/feldera` to be wrapped in a nice database, `https://materialize.com/` to solve their memory issues, or `https://clickhouse.com/docs/en/materialized-view` to solve reliable streaming consumption.

Various stream processing frameworks have domain-specific languages with a lot of limitations on how you can express aggregations and transformations.


Replies

def-, last Thursday at 9:28 AM

> [...] `https://materialize.com/` to solve their memory issues [...]

Disclaimer: I work at Materialize

Recently there have been major improvements in Materialize's memory usage, as well as support for swapping some data out to disk.

I find it pretty easy to hook up to Postgres/MySQL/Kafka instances: https://materialize.com/blog/materialize-emulator/

knuckleheads, last Thursday at 7:31 AM

Yeah, I have a feeling something like Polars for streaming would be super popular and useful, but it just hasn't happened yet. It's much easier to just use, say, Kafka and a long-running Python script and write out the transformations by hand than it is to use anything on the market right now. None of the current stream processors want to be embedded as far as I can tell; that's not where the money is. They all want to be paid to run it in the cloud for you and follow that VC playbook model. Which, fair! I do think there's a lot of space out there that isn't being occupied though, and I hope somebody tries to fill it soon.
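To make "write out the transformations by hand" concrete, this is roughly the kind of loop I mean. It's a minimal sketch: `tumbling_counts` and the `(timestamp, key)` event shape are made up for illustration, the input is any Python iterable standing in for a Kafka consumer, and it assumes events arrive in timestamp order, which real streams usually don't guarantee.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

Event = Tuple[int, str]  # (event time in seconds, key); hypothetical shape


def tumbling_counts(
    events: Iterable[Event], size: int
) -> Iterator[Tuple[int, Dict[str, int]]]:
    """Yield (window_start, per-key counts) each time a window closes.

    Assumes events arrive ordered by timestamp, so a window is closed
    as soon as an event from a different window shows up.
    """
    current_start = None
    counts: Dict[str, int] = defaultdict(int)
    for ts, key in events:
        start = ts - ts % size
        if current_start is not None and start != current_start:
            yield current_start, dict(counts)  # emit the finished window
            counts.clear()
        current_start = start
        counts[key] += 1
    if current_start is not None:
        yield current_start, dict(counts)  # flush the last open window


if __name__ == "__main__":
    stream = [(1, "a"), (2, "b"), (3, "a"), (61, "a"), (125, "b")]
    for window_start, per_key in tumbling_counts(stream, size=60):
        print(window_start, per_key)
```

It's twenty-odd lines and it embeds anywhere, which is the appeal. What it doesn't give you is out-of-order handling, persistence of the in-flight counts, or exactly-once output, and that's the gap a good embeddable library would fill.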

(As an aside, Feldera doesn't want to be embedded into your app, and Materialize doesn't either. ClickHouse might just pull a great streaming library out of nowhere; they seem to be good at doing stuff like that.)