HPC infrastructure isn't like what you're used to. Bandwidth is very high, but latency depends on where your data lives. There are a lot more layers that complicate things, and each layer has a very different I/O speed.
https://extremecomputingtraining.anl.gov/sites/atpesc/files/...
How you handle the data can also be very different. Just look at how libraries like this work: they take advantage of burst buffers and try to minimize what gets pulled from storage. There's also a lot of memory management in the code people write to do all this, so that you aren't waiting around for disks... or worse... tape.
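To make the "don't wait around for disks" idea concrete, here's a rough sketch of read-ahead in plain Python. It is not how any particular HPC library does it and it doesn't touch a real burst buffer API. It just uses a background thread and a small bounded queue as a stand-in, so the next chunk is being read while you work on the current one. The names `prefetch_chunks`, `CHUNK_SIZE` and `depth` are made up for illustration, and the chunk size is an assumption you'd have to tune for your filesystem.

```python
import queue
import threading

CHUNK_SIZE = 1 << 20  # 1 MiB per read; hypothetical value, tune per filesystem


def prefetch_chunks(path, chunk_size=CHUNK_SIZE, depth=2):
    """Yield chunks of `path` while a background thread reads ahead.

    The bounded queue caps memory use: roughly depth + 2 chunks are
    alive at once (queued, one being read, one being processed).
    """
    buf = queue.Queue(maxsize=depth)
    sentinel = object()  # marks end of file

    def reader():
        try:
            with open(path, "rb") as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    buf.put(chunk)  # blocks when the queue is full
        except Exception as exc:
            buf.put(exc)  # hand I/O errors to the consumer
        finally:
            buf.put(sentinel)

    # Daemon thread, so an abandoned generator won't keep the process alive.
    threading.Thread(target=reader, daemon=True).start()

    while True:
        item = buf.get()
        if item is sentinel:
            return
        if isinstance(item, Exception):
            raise item
        yield item
```

You'd use it like `for chunk in prefetch_chunks("/path/to/file"): process(chunk)`, where `process` is whatever compute you're doing. The point is only that the slow tier is being read in the background instead of stalling the loop. Real codes do the same thing with far more care about where each buffer lives.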