One of the choke points of all modern video codecs that aim for high compression ratios is arithmetic entropy coding: CABAC for H.264 and H.265, 16-symbol arithmetic coding for AV1. AFAIK there is no way to parallelize it, because decoding each symbol depends on the state left by the previous one. All you can do is a bit of speculative decoding, but that doesn't go very deep. Even when implemented in hardware, arithmetic decoding is hard to parallelize.
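To make the dependency concrete, here is a minimal sketch of an adaptive binary range coder in Python. It is not CABAC or AV1's coder: the constants are borrowed from the LZMA-style scheme, the names (`adapt`, `encode`, `decode`) are made up for illustration, and the encoder leans on Python's unbounded integers instead of real carry handling. The point is the decoder loop: `code`, `rng` and `prob` are all rewritten by every bit, so bit N+1 cannot be started before bit N is finished.

```python
PROB_BITS = 11       # probabilities are 11-bit fixed point (assumed, LZMA-style)
PROB_INIT = 1 << 10  # start at P(bit == 0) = 0.5
ADAPT_SHIFT = 5      # adaptation rate of the probability model
TOP = 1 << 24        # renormalize once the range drops below this


def adapt(prob, bit):
    """Nudge the probability of a 0 bit toward the bit just seen."""
    if bit == 0:
        return prob + (((1 << PROB_BITS) - prob) >> ADAPT_SHIFT)
    return prob - (prob >> ADAPT_SHIFT)


def encode(bits):
    """Encode a list of 0/1 values into bytes."""
    low, rng, shifts, prob = 0, 0xFFFFFFFF, 0, PROB_INIT
    for bit in bits:
        bound = (rng >> PROB_BITS) * prob  # split the range by P(0)
        if bit == 0:
            rng = bound
        else:
            low += bound
            rng -= bound
        prob = adapt(prob, bit)
        while rng < TOP:
            rng <<= 8
            low <<= 8  # unbounded int, so carries propagate on their own
            shifts += 1
    return low.to_bytes(4 + shifts, "big")


def decode(data, count):
    """Decode `count` bits; every iteration needs the previous one's state."""
    code = int.from_bytes(data[:4], "big")
    pos, rng, prob = 4, 0xFFFFFFFF, PROB_INIT
    out = []
    for _ in range(count):
        bound = (rng >> PROB_BITS) * prob  # depends on rng AND prob
        bit = 0 if code < bound else 1
        if bit == 0:
            rng = bound
        else:
            code -= bound
            rng -= bound
        prob = adapt(prob, bit)            # the model changes with each bit
        while rng < TOP:                   # how many bytes get consumed
            rng <<= 8                      # also depends on the bit value
            code = (code << 8) | data[pos]
            pos += 1
        out.append(bit)
    return out


if __name__ == "__main__":
    message = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0] * 20
    packed = encode(message)
    assert decode(packed, len(message)) == message
    print(len(message), "bits ->", len(packed), "bytes")
```

Note that even the position in the byte stream (`pos`) is only known after the previous bit has been decoded, which is why you can't just jump into the middle of the stream and start a second decoder there.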
This is especially a choke point at high-quality settings, where the bitrate is higher and there are more symbols to decode per frame. The prediction and filtering steps later in the decoding pipeline are relatively easy to parallelize.
High-throughput codecs like ProRes don't use arithmetic coding but a much simpler, table-based coding scheme.
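For comparison, a rough sketch of what table-based decoding looks like, again in Python. This is a generic prefix code with a made-up four-symbol alphabet (`CODE`), not the actual ProRes code tables, and it works on a string of "0"/"1" characters to keep it short. Each symbol costs one table lookup and a pointer bump; there is no range arithmetic and no probability model to update.

```python
# Hypothetical prefix code for illustration: shorter words for likelier symbols.
CODE = {"A": "0", "B": "10", "C": "110", "D": "111"}
MAX_LEN = 3  # length of the longest code word


def build_table(code, max_len):
    """Map every max_len-bit window to (symbol, code word length)."""
    table = [None] * (1 << max_len)
    for sym, word in code.items():
        pad = max_len - len(word)
        base = int(word, 2) << pad
        # Every window that starts with `word` decodes to `sym`.
        for tail in range(1 << pad):
            table[base | tail] = (sym, len(word))
    return table


def decode(bitstring, count, table, max_len):
    """Decode `count` symbols with one lookup per symbol."""
    padded = bitstring + "0" * max_len  # so the last window is never short
    pos, out = 0, []
    for _ in range(count):
        window = int(padded[pos:pos + max_len], 2)
        sym, length = table[window]
        out.append(sym)
        pos += length  # advance by the length of the matched code word
    return out


TABLE = build_table(CODE, MAX_LEN)

if __name__ == "__main__":
    text = "ABACDAB"
    encoded = "".join(CODE[s] for s in text)
    assert decode(encoded, len(text), TABLE, MAX_LEN) == list(text)
```

The symbols are still read one after another, but the work per symbol is tiny and the table never changes, which makes this kind of decoder much cheaper to run fast.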
FFV1's range coder has higher complexity than CABAC. The issue is serialization. Mainstream codecs require that a block depend on previously decoded blocks. Tiles exist, but they're so much larger, and so rarely used, that they may as well not exist.