Hacker News

dist-epoch · today at 10:20 AM

Can anybody explain why, so often, switching away from the chat app on your phone breaks the conversation?

Long-lived requests, where you submit one, get back a request_id, and then poll for its status, are a 20-year-old solved problem.
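To make the pattern concrete, here's a minimal sketch of the submit-then-poll flow the comment describes. The names (`submit`, `poll`, `JOBS`) and the in-memory dict are hypothetical; a real service would back the job store with a database so any client can poll from any connection.

```python
import threading
import time
import uuid

# Hypothetical in-memory job store; a real service would use a database
# so a reconnecting client (or another server) can still poll the job.
JOBS = {}

def submit(prompt):
    """Kick off a long-running job and immediately return a request_id."""
    request_id = str(uuid.uuid4())
    JOBS[request_id] = {"status": "pending", "result": None}

    def work():
        time.sleep(0.1)  # stand-in for the slow LLM call
        JOBS[request_id] = {"status": "done", "result": f"reply to: {prompt}"}

    threading.Thread(target=work, daemon=True).start()
    return request_id

def poll(request_id):
    """Clients call this repeatedly; it survives app switches and refreshes,
    because the state lives server-side, not in the HTTP connection."""
    return JOBS[request_id]

rid = submit("hello")
while poll(rid)["status"] != "done":
    time.sleep(0.05)
print(poll(rid)["result"])
```

The key property is that the client holds only the `request_id`, so dropping the connection loses nothing.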

Why is this such a difficult thing to do in practice for chat apps? Do we need ASI to solve this problem?


Replies

zknill · today at 11:13 AM

I suspect the answer is that the AI chat app is built so that the LLM response tokens are sent straight into the HTTP response as an SSE stream, without being stored (in their intermediate state) in a database. BUT the 'full' response _is_ stored in the database once the LLM stream is complete, just not the intermediate tokens.
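A sketch of the fix implied here: tee each token into a store as it is streamed, so a reconnecting client can recover the partial response instead of waiting for the completed one. All names (`PARTIALS`, `sse_stream`, `resume`, `fake_llm_stream`) are hypothetical illustrations, not any real app's API.

```python
# Hypothetical partial-response store, keyed by message id. A real app
# would use Redis or a database shared across servers.
PARTIALS = {}

def fake_llm_stream():
    """Stand-in for the token stream coming back from the LLM."""
    yield from ["Hello", ", ", "world"]

def sse_stream(message_id):
    """Stream tokens to the client as SSE events, persisting each one
    as it goes by, rather than only storing the finished response."""
    PARTIALS[message_id] = []
    for token in fake_llm_stream():
        PARTIALS[message_id].append(token)  # store the intermediate state
        yield f"data: {token}\n\n"          # and forward it as an SSE event

def resume(message_id):
    """On page refresh / reconnect, replay whatever was already stored."""
    return "".join(PARTIALS.get(message_id, []))

events = list(sse_stream("msg-1"))
print(resume("msg-1"))
```

With this, a refresh mid-stream can show the partial text immediately instead of a broken bubble that only fills in once the full response lands in the database.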

If you look at the gifs of the Claude UI in this post[1], you can see how the HTTP response is broken on page refresh, but some time later the full response is available again because it's now being served 'in full' from the database.

[1]: https://zknill.io/posts/chatbots-worst-enemy-is-page-refresh...