dofm • today at 8:06 AM

I doubt you'd see lower latency from local models, except with quite small models in edge uses.

The cloud products from the big two seem optimised in every way for speed, including speed of initial response.

I don’t think most people are running local models for speed. More for control, privacy, interest, bloody-mindedness and general principle.
