DuckDB is primarily a query engine. It does have its own storage format, but one of its strengths is querying data where it already resides (e.g. a Parquet file sitting in S3).
There are some examples[0] of enabling DuckDB to manage distributed workloads, but these are pretty experimental.
Thanks for the pointers!