You’re out of the loop if you don’t think M-series chips with unified memory are one of the best platforms for running local inference.
They aren't. With SOTA LLMs in agentic workflows, Apple Silicon's prefill and decode speeds are unusable for interactive work.