This is the same reason Itanium's VLIW approach failed: you don't actually want to force the decision of what is in the cache back to compile time, when the relevant information is better available at runtime.
Also, encoding additional information in instructions costs instruction bandwidth and I-cache space.
That seems correct, but it also doesn't account for managed languages with runtimes like JavaScript or Java or .NET, which probably have a lot of interesting runtime info they could use to influence caching behavior. There's a "who caches the cacher" problem if you go down this path (who manages cache lines for the V8 native code that is in turn managing cache lines for jitted JavaScript code), but it still seems like there is opportunity there?
That's a strange statement. It's certainly not black and white, but the compiler has explicit lifetime information, while the cache infrastructure is using heuristics. I worked on a project which supported region tags in the cache for compiler-directed allocation, and it showed some decent gains (in simulation).
I guess this is one place where it seems possible to allow compiler annotations without disabling the default heuristics, so you could maybe get the best of both.
> you don't actually want to force the decision of what is in the cache back to compile time, when the relevant information is better available at runtime
That is very context-dependent. In high-performance code, having explicit control over caches can be very beneficial. CUDA and similar APIs give you that ability, and it is used extensively.
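For context, the explicit control CUDA exposes is `__shared__` memory: a per-block, programmer-managed scratchpad, so what is "cached" on-chip is decided in the code rather than by a hardware heuristic. A minimal sketch (kernel and names are illustrative):

```cuda
// Block-wise sum: each block explicitly stages its slice of the input
// into on-chip shared memory, then reduces entirely out of that tile.
__global__ void sum_tiles(const float *in, float *out, int n) {
    __shared__ float tile[256];          // explicitly managed, per-block
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Stage one element per thread into the scratchpad.
    tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();                     // tile is now fully populated

    // Tree reduction over the explicitly cached tile.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s)
            tile[threadIdx.x] += tile[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0)
        out[blockIdx.x] = tile[0];
}
```

Nothing here is heuristic: the residency of `tile` and the moment it is filled and drained are both fixed at compile time by the programmer, which is exactly the trade the parent comment is describing.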
Now, for general "I wrote some code and want the hardware to run it fast with little effort from my side", I agree that transparent caches are the way.