jcgrillo, today at 2:21 AM

Yeah, despite the conceptual statelessness, there is quite a bit of state that hangs around: the KV cache and the context. I still haven't been able to find anything concrete in the docs about how these are isolated. In any case, it's clearly a different class of issue than the one from the article. It's not endemic to how LLMs work, just normal web session stuff, modulo some GPU memory handling.