Hacker News

kamranjon, yesterday at 5:31 PM

You’re out of the loop if you don’t think M-series chips with unified memory are one of the best platforms for running local inference.


Replies

bigyabai, yesterday at 5:39 PM

They aren't. Apple Silicon's prefill and decode speeds are unusable for interactive agentic workflows with SOTA LLMs.
