
hipadev23 · 11/09/2024

Are you using the S3 local cache? Do you have heavy writes? Which S3 disk type, if any, are you using (s3, s3_plain, or s3_plain_rewritable)? Or are you just using the s3 table functions?

ClickHouse is amazing, but I still struggle to get it working efficiently on S3, especially for writes.
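
For readers unfamiliar with the disk types named above: they come from ClickHouse's storage configuration. A minimal sketch of what an S3-backed MergeTree table with a local read cache looks like, using the dynamic disk syntax available in recent ClickHouse versions (an XML storage_configuration block achieves the same thing); the table schema, bucket endpoint, credentials, cache path, and sizes here are all placeholders:

    -- Sketch: MergeTree data stored on S3, fronted by a local filesystem
    -- cache so repeated reads don't hit the bucket every time.
    -- All names, paths, and credentials below are made up.
    CREATE TABLE events
    (
        ts      DateTime,
        payload String
    )
    ENGINE = MergeTree
    ORDER BY ts
    SETTINGS disk = disk(
        type = cache,
        max_size = '10Gi',
        path = '/var/lib/clickhouse/caches/s3_cache/',
        disk = disk(
            type = s3,
            endpoint = 'https://my-bucket.s3.amazonaws.com/events/',
            access_key_id = '...',
            secret_access_key = '...'));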


Replies

broner · 11/09/2024

My workload is 100% reads: querying zstd-compressed Parquet on S3 Standard. Neither ClickHouse nor DuckDB has a great S3 driver, which is why smart people like https://www.boilingdata.com/ wrote their own. I compared a handful of queries and found that DuckDB makes a lot of round trips, while ClickHouse takes the opposite approach and just reads the entire Parquet file.
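
For concreteness, the kind of query being compared probably looks something like this (bucket and paths are made up, and credentials are assumed to be configured out of band): ClickHouse reads Parquet from S3 via its s3 table function, DuckDB via the httpfs extension and read_parquet.

    -- ClickHouse: s3 table function. Per the comparison above, it tends
    -- to fetch the whole Parquet object rather than selective ranges.
    SELECT count()
    FROM s3('https://my-bucket.s3.amazonaws.com/data/*.parquet', 'Parquet');

    -- DuckDB: httpfs extension. It issues ranged GETs (footer first,
    -- then only the row groups/columns the query touches), which is
    -- where the many round trips come from.
    INSTALL httpfs;
    LOAD httpfs;
    SELECT count(*)
    FROM read_parquet('s3://my-bucket/data/*.parquet');

Which pattern wins depends on the query: full-object reads amortize well when you scan most columns anyway, while ranged reads pay off for narrow projections over wide files, at the cost of per-request latency.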