I don't wanna see that S3 bandwidth bill after running some big query.
There are self-hosted object stores that speak the same protocol as S3. One example is MinIO: https://github.com/minio/minio
Parquet files are smaller than row-based storage in a database (though not necessarily smaller than databases that focus on strong compression).
And for backup, the files are probably easier to just copy to multiple disks for redundancy than to handle with database dumps and incremental backups, which will be a pain at petabyte scale.
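For what it's worth, the "just copy the files to multiple disks" idea can be sketched in a few lines of Python. This is only a minimal sketch, not a backup tool: `sha256_of` and `replicate` are hypothetical helper names, and the target directories stand in for the mount points of separate disks.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source: Path, target_dirs: list[Path]) -> list[Path]:
    """Copy one file into each target directory and verify every copy.

    The target directories are assumed to live on different disks;
    nothing here checks that they actually do.
    """
    expected = sha256_of(source)
    copies = []
    for target_dir in target_dirs:
        target_dir.mkdir(parents=True, exist_ok=True)
        dest = target_dir / source.name
        shutil.copy2(source, dest)  # copies contents plus basic metadata
        if sha256_of(dest) != expected:
            raise OSError(f"checksum mismatch for {dest}")
        copies.append(dest)
    return copies
```

Since Parquet files are normally written once and not modified in place, a copy that passes the checksum stays valid until the source file is replaced, which is a big part of why this is simpler than incremental database backups.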
There’s no S3 bandwidth bill for traffic to and from EC2 in the same region.
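To put numbers on the bandwidth worry, here is a rough sketch of the arithmetic. The function name `transfer_cost_usd` is made up for illustration, and `price_per_gb` is a placeholder you would fill in from the current pricing page, not a real rate. It only models the data transfer part of the bill.

```python
def transfer_cost_usd(gigabytes: float, same_region: bool, price_per_gb: float) -> float:
    """Estimate the data transfer charge for reading `gigabytes` out of S3.

    Assumption: reads from EC2 in the same region incur no transfer charge,
    and everything else is billed at a flat `price_per_gb` (placeholder rate).
    """
    if same_region:
        return 0.0
    return gigabytes * price_per_gb


# Scanning 50 TB from EC2 in the same region vs. from somewhere else,
# using a placeholder rate of 0.10 USD per GB.
scan_gb = 50 * 1024
in_region = transfer_cost_usd(scan_gb, same_region=True, price_per_gb=0.10)
elsewhere = transfer_cost_usd(scan_gb, same_region=False, price_per_gb=0.10)
```

So the scary bill only shows up when the compute that runs the big query sits outside the bucket's region.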