Actually, I think it sort of was. I remember Berkeley squeezing a ton of perf out of their Cray for a crazy task, because it was easy to specialize some wild semi-sparse matrix computations to an architecture with strange memory/cache bottlenecks while still being guaranteed that the results were okay.