Hacker News

andsoitis · yesterday at 8:29 PM

Why doesn’t Apple?


Replies

tpurves · yesterday at 8:45 PM

Like with all new tech trends, it takes them a hot minute to catch up, but it's highly likely they will (eventually) release some killer platforms for local AI. The shared memory, high bandwidth, and power efficiency of their M chips make for a near-ideal architecture. If/when they finally push out the M5 Ultra, that could be round one (albeit still not at the best price/performance vs. comparable cloud API tokens). A real mass-market killer device for local LLMs will still require some remediation of the global DRAM shortages, and maybe the M6/M7 generation.

dvt · yesterday at 8:53 PM

Apple has Metal, which is already pretty well integrated into llama.cpp, various Python libs, and mistral-rs & candle. Unpopular opinion, but Vulkan is hot garbage and the definition of "design by committee." There's a reason people still prefer CUDA, and most code could likely be ported programmatically anyway.
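For illustration: on Apple Silicon, llama.cpp compiles its Metal backend in by default, so GPU offload is just a flag. A minimal sketch (the model path here is a placeholder, not a real file):

```shell
# Build llama.cpp on macOS; the Metal backend is enabled by default on Apple Silicon.
cmake -B build && cmake --build build --config Release

# Run with layers offloaded to the GPU via Metal.
# -ngl sets the number of layers to offload (99 = effectively all);
# the .gguf path below is hypothetical.
./build/bin/llama-cli -m ./models/model.gguf -ngl 99 -p "Hello"
```

Forcing CPU-only inference for comparison is just `-ngl 0`, which makes the Metal speedup easy to measure on the same machine.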

coldtea · yesterday at 9:06 PM

Vulkan is not Apple.

Metal is Apple's API.

hansonkd · yesterday at 8:36 PM

After the steep increase in sales of Mac Studios specifically for LLMs, I'm waiting for Apple to release a frontier-level model optimized for the highest end of Apple hardware (probably hardware-locked to a specific Neural Engine, which would in turn lock the memory config).

The built-in Apple Intelligence model right now is very small, but even just having a small LLM you know is always there, online, fast, and ready makes you think about building apps differently. I would love the context to expand from the meager ~4K tokens.