> There is no latency
Every chat bot I was ever forced to use has built-in latency, together with animated … to simulate a real user typing. It’s the worst of all worlds.
Because they are all using some cloud service and an external LLM for that. We don't.
We sell our users a powerful server that holds all their data and all their services. The LLM runs locally and is trained by us.
> to simulate a real user typing
The models return a real-time stream of tokens, so nothing is simulated: the text appears as it is generated.
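
A rough illustration of what streaming means here. This is only a sketch, not actual product code: `generate_tokens` and `render_stream` are made-up names, and the "model" is a stand-in that just echoes the prompt word by word.

```python
from typing import Iterator


def generate_tokens(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for a local model: yields one token at a time."""
    for word in ("Echo:", *prompt.split()):
        yield word + " "


def render_stream(tokens: Iterator[str]) -> str:
    """Show each token as soon as it arrives, with no artificial delay."""
    shown = []
    for token in tokens:
        # flush=True pushes the token to the terminal immediately
        print(token, end="", flush=True)
        shown.append(token)
    print()
    return "".join(shown)
```

Calling `render_stream(generate_tokens("no fake typing"))` prints the words one after another as the generator produces them. The incremental display comes from the stream itself, not from a timer.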