Hacker News

dofm, today at 8:06 AM

I doubt you would see lower latency from local models, except with quite small models in edge use cases.

The cloud products from the big two seem optimised in every way for speed, even for the speed of the initial response.

I don’t think most people are running local models for speed. More for control, privacy, interest, bloody-mindedness and general principle.