Hacker News

Quack-Cluster: A Serverless Distributed SQL Query Engine with DuckDB and Ray

30 points by tanelpoder, last Tuesday at 12:24 AM | 4 comments

Comments

fodkodrasz, today at 5:18 PM

So DuckDB was developed to finally allow queries on biggish data without the need for a cluster, to simplify data analysis... and now we put it on a cluster?

I think there are solutions for that scale of data already, and simplicity is the best feature of DuckDB (at least for me).

mgaunard, today at 4:39 PM

In my experience, Ray clusters don't scale well and end up costing you more money. You need to run permanent per-user instances, etc.

What you need is a multi-tenancy shared infrastructure that is elastic.

dogman123, today at 4:32 PM

neat. i'm pretty novice in the guts of this kind of stuff, but how does this work under the hood for blocking operators where they "cannot output a single row until the last row of their input has been seen"?

i think this is where spark shuffling comes in? but how does it work here?

https://duckdb.org/docs/stable/guides/performance/how_to_tun...
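
For context on the question above, one common way distributed engines handle a blocking operator such as GROUP BY is two-phase aggregation: each worker aggregates its own partition, the partial results are shuffled so that every key lands on exactly one reducer, and the reducers merge them. The sketch below is plain Python and only illustrates that general idea. It is an assumption, not a description of how Quack-Cluster, DuckDB, or Ray actually do it, and all function names are hypothetical.

```python
from collections import defaultdict


def partial_aggregate(rows):
    """Worker side: sum values per key for one partition only."""
    sums = defaultdict(int)
    for key, value in rows:
        sums[key] += value
    return dict(sums)


def shuffle(partials, num_reducers):
    """Route every partial sum for a given key to a single reducer bucket."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for partial in partials:
        for key, subtotal in partial.items():
            # Hash partitioning: the same key always maps to the same bucket.
            buckets[hash(key) % num_reducers][key].append(subtotal)
    return buckets


def final_aggregate(bucket):
    """Reducer side: merge the partial sums it received for each key."""
    return {key: sum(subtotals) for key, subtotals in bucket.items()}


def distributed_sum(partitions, num_reducers=2):
    # Phase 1: independent per-partition aggregation (parallel in a real cluster).
    partials = [partial_aggregate(p) for p in partitions]
    # Phase 2: shuffle by key, then merge. No reducer emits a row until it
    # has seen all partials for its keys, which is the "blocking" part.
    result = {}
    for bucket in shuffle(partials, num_reducers):
        result.update(final_aggregate(bucket))
    return result


# Three partitions standing in for three workers' slices of one table.
partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("b", 4), ("c", 5)],
    [("a", 6), ("c", 7)],
]
totals = distributed_sum(partitions)
print(totals)
```

The design point is that only the small per-key partial sums cross the network, not the raw rows; sorts and joins usually need a fuller shuffle of the rows themselves.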

nevalainen, today at 4:48 PM

feels like a missed opportunity to call it cluster-quack xD
