I’m shocked that this hasn’t been a thing from the start. That seems like table stakes for automating repetitive tasks.
It has been a thing. In a single request, this same cache is reused for each forward pass.
It took a while for companies to start metering it and charging accordingly.
Also companies invested in hierarchical caches that allow longer term and cross cluster caching.
It has been a thing. In a single request, this same cache is reused for each forward pass.
It took a while for companies to start metering it and charging accordingly.
Also companies invested in hierarchical caches that allow longer term and cross cluster caching.