Any opinions on DuckLake?
I had a very good experience with it last year. I used it at large scale with data that had previously been in Iceberg, and it worked flawlessly. It’s only improved since. Highly recommend.
With my enterprise hat on, I'd say Athena + S3 is good enough. Only use DuckDB for ad hoc analysis.
Seems stable enough; they've patched a bunch of things.
The problem space that DuckLake solves is smaller, but it helped me get a working Metabase dashboard up quickly on ~1 TB of data with 128 GB of RAM. Queries were much, much faster than with any of the alternatives.
Some downsides: no unique constraints with indexes (so you can accidentally shoot yourself in the foot with double ingestion), and writing is a bit cumbersome if you already have Parquet files.
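
On the double-ingestion point: one common workaround when the engine won't enforce uniqueness is to land each batch in a staging table and insert only the keys the target doesn't have yet (an anti-join). Here's a minimal sketch of that pattern. Big caveats: I haven't run this against DuckLake itself. It uses Python's built-in `sqlite3` purely as a stand-in so it runs anywhere, and the `events` table, the `id` key, and the `ingest_batch` helper are all made up for illustration. The SQL is plain `INSERT ... SELECT ... WHERE NOT EXISTS`, so I'd expect it to carry over, but treat that as an assumption.

```python
import sqlite3


def ingest_batch(con, rows):
    """Insert (id, payload) rows into `events`, skipping ids already present.

    `events`, `staging`, and `id` are hypothetical names for this sketch.
    Returns the number of rows actually added.
    """
    # Land the incoming batch in a scratch table first.
    con.execute("CREATE TEMP TABLE staging (id INTEGER, payload TEXT)")
    con.executemany("INSERT INTO staging VALUES (?, ?)", rows)

    before = con.execute("SELECT count(*) FROM events").fetchone()[0]

    # Anti-join: copy over only rows whose id is not in the target yet.
    # DISTINCT removes exact duplicate rows inside the batch; it does NOT
    # resolve two different payloads that share the same id.
    con.execute(
        """
        INSERT INTO events
        SELECT DISTINCT s.id, s.payload
        FROM staging AS s
        WHERE NOT EXISTS (SELECT 1 FROM events AS e WHERE e.id = s.id)
        """
    )

    after = con.execute("SELECT count(*) FROM events").fetchone()[0]
    con.execute("DROP TABLE staging")
    return after - before


# Stand-in database; the target table deliberately has no unique constraint.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (id INTEGER, payload TEXT)")

batch = [(1, "a"), (2, "b")]
first = ingest_batch(con, batch)   # first load of the batch
second = ingest_batch(con, batch)  # accidental re-run of the same batch
total = con.execute("SELECT count(*) FROM events").fetchone()[0]
print(first, second, total)
```

The trade-off is that every load pays for a lookup against the target table, so on a large table you'd probably want to restrict the check to the partition or date range the batch touches.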