Conclusion: both are true, which makes sense. KV-cache scaling both yields the emergent power and demands the enormous memory capacity.
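The capacity side of that claim is easy to see with a back-of-envelope calculation. A minimal sketch, assuming illustrative (not official) figures loosely modeled on a 70B-class model with grouped-query attention:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   bytes_per_elem=2, batch=1):
    # Factor of 2 for the separate K and V tensors cached per layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem * batch

# Assumed figures: 80 layers, 8 KV heads, head_dim 128,
# a 128k-token context, fp16 (2 bytes/element).
size = kv_cache_bytes(80, 8, 128, 128_000)
print(f"{size / 2**30:.1f} GiB per sequence")
```

The cache grows linearly with context length and batch size, so long contexts at high concurrency dominate serving memory, which is the capacity pressure the conclusion above points at.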
Which does sort of hint at a (power/profitability) ceiling on the LLM line of AI… That should make the industry nervous.