All Apple devices have an NPU which could potentially save power on compute-bound operations like prefill (at least if you're OK with FP16 FMA/INT8 MADD arithmetic). It's just a matter of hooking up support in the main local AI frameworks. This isn't a speedup per se, but it gives you more headroom wrt. power and thermals for everything else, so it should yield higher performance overall.
AFAIK, only Core ML can use Apple's NPU (the ANE). PyTorch, MLX and the other kids on the block use MPS (the GPU). I think the limitations you mentioned relate to that (but I might be missing something).
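Right, and for anyone curious what that path looks like in practice: a minimal sketch of targeting the ANE via Core ML, assuming coremltools and PyTorch are installed. The toy model is hypothetical, just a stand-in for a real network. Note that compute_units=CPU_AND_NE only *requests* the Neural Engine; the runtime schedules ops per-layer and silently falls back to CPU for anything the ANE can't run.

    import torch
    import coremltools as ct

    # Toy model standing in for a real network (hypothetical, for illustration).
    model = torch.nn.Sequential(
        torch.nn.Linear(512, 512),
        torch.nn.ReLU(),
        torch.nn.Linear(512, 512),
    ).eval()

    example = torch.randn(1, 512)
    traced = torch.jit.trace(model, example)

    # Convert to Core ML and ask for the Neural Engine. There is no public API
    # to force ANE-only execution; Core ML decides placement op by op.
    mlmodel = ct.convert(
        traced,
        inputs=[ct.TensorType(name="x", shape=example.shape)],
        compute_units=ct.ComputeUnit.CPU_AND_NE,
    )

    out = mlmodel.predict({"x": example.numpy()})

Whether a given layer actually lands on the ANE is opaque from this API; you have to check with Instruments or Xcode's Core ML performance report.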