johnsmith1840 • yesterday at 8:39 PM

I think the real use case is a future technology: something similar to speculative decoding, but done across servers.

The local model answers and reaches into the cloud only for the hard tokens.
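
A rough sketch of what that could look like, purely as an illustration. Everything here is an assumption: the `generate` function, the `threshold` value, and the toy `toy_local` / `toy_cloud` models are hypothetical stand-ins, not any real library's API. It is also a simplification: this is confidence-based fallback (ask the cloud when the local model is unsure), not true speculative decoding, where the large model verifies a batch of drafted tokens.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: a "model" maps a token prefix to a next-token distribution.
Dist = Dict[str, float]
Model = Callable[[List[str]], Dist]


def generate(local: Model, cloud: Model, prompt: List[str],
             max_new_tokens: int = 16, threshold: float = 0.6,
             eos: str = "<eos>") -> Tuple[List[str], int]:
    """Greedy decoding that defers to the cloud model on low-confidence tokens.

    Returns the newly generated tokens and the number of cloud calls made.
    """
    tokens = list(prompt)
    cloud_calls = 0
    for _ in range(max_new_tokens):
        dist = local(tokens)
        token, prob = max(dist.items(), key=lambda kv: kv[1])
        if prob < threshold:
            # "Hard" token: the local model is unsure, so ask the cloud model.
            cloud_calls += 1
            cloud_dist = cloud(tokens)
            token = max(cloud_dist.items(), key=lambda kv: kv[1])[0]
        if token == eos:
            break
        tokens.append(token)
    return tokens[len(prompt):], cloud_calls


# Toy stand-ins (assumptions for the demo, not real models).
def toy_local(prefix: List[str]) -> Dist:
    if prefix[-1] == "of":
        # The one position where the small model is genuinely uncertain.
        return {"Paris": 0.4, "France": 0.35, "Lyon": 0.25}
    table = {"The": "capital", "capital": "of", "France": "is",
             "is": "Paris", "Paris": "<eos>"}
    return {table.get(prefix[-1], "<eos>"): 0.9, "<unk>": 0.1}


def toy_cloud(prefix: List[str]) -> Dist:
    if prefix[-1] == "of":
        return {"France": 0.95, "Paris": 0.05}
    return {"<eos>": 1.0}


out, calls = generate(toy_local, toy_cloud, ["The"])
print(" ".join(out), "| cloud calls:", calls)
```

In this toy run the cloud is hit once, for the single token where the local model's top probability falls under the threshold. The interesting design question is the threshold: too low and the local model guesses on hard tokens, too high and every token becomes a network round trip.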
