We just create mini data "ponds" on the fly by copying tenant-isolated gold-tier data to Parquet in S3. The user/agent queries are executed with DuckDB. We run this process when the user starts a session and generate an STS token scoped to their tenant's bucket path. It's extremely simple and works well (at least at our data volumes).
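For anyone curious what that looks like in practice, here's a rough sketch of the per-session part, assuming the gold data already lives under `s3://analytics-gold/<tenant_id>/*.parquet`. The bucket name, role ARN, and setting names are my placeholders, not necessarily the commenter's actual setup:

```python
import json


def tenant_scoped_policy(bucket: str, tenant_id: str) -> dict:
    """Inline session policy that narrows the assumed role to a single
    tenant's prefix, so the STS token can only read that tenant's data."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{tenant_id}/*"],
            },
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{tenant_id}/*"]}},
            },
        ],
    }


def start_session(tenant_id: str):
    """Hypothetical session setup: mint a tenant-scoped STS token, then hand
    the temporary credentials to DuckDB. Requires boto3 + duckdb and real
    AWS resources, so it's illustrative only."""
    import boto3
    import duckdb

    creds = boto3.client("sts").assume_role(
        RoleArn="arn:aws:iam::123456789012:role/duckdb-reader",  # placeholder
        RoleSessionName=f"tenant-{tenant_id}",
        Policy=json.dumps(tenant_scoped_policy("analytics-gold", tenant_id)),
        DurationSeconds=3600,
    )["Credentials"]

    con = duckdb.connect()
    con.execute("INSTALL httpfs; LOAD httpfs;")  # S3 support
    con.execute(f"SET s3_access_key_id='{creds['AccessKeyId']}'")
    con.execute(f"SET s3_secret_access_key='{creds['SecretAccessKey']}'")
    con.execute(f"SET s3_session_token='{creds['SessionToken']}'")
    # Queries then read only from the tenant's prefix, e.g.:
    #   con.sql("SELECT * FROM 's3://analytics-gold/<tenant>/orders/*.parquet'")
    return con
```

The nice property is that even if the query layer has a bug, the STS token itself can't read outside the tenant's prefix.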
How large are these data volumes? How long does it take to prepare the data when a customer request comes in?
How do you copy all the relevant data? Doesn't this create unnecessary load on your source DB?
This is cool, but I don't think it would work for our use case. We're dealing with billions of rows for some tenants.
We're also about to introduce alerts, where users can write their own TRQL queries and define alerts on top of them. That requires evaluating the queries regularly, so the data effectively needs to be continuously up to date.