Concurrency?
Suppose some some parallelized, distributed task requires 700GB of memory (I don't know if it does or does not) per node to accomplish, and that speed is a concern.
A singular pile of memory that is 700GB is insufficient not because it lacks capacity, but instead because it lacks scalability. That pile is only enough for 1 node.
If more nodes were added to increase speed but they all used that same single 700GB pile, then RAM bandwidth (and latency) gets in the way.