mortenjorck • last Monday at 9:17 PM • 2 replies

I assume those don't just work automatically with an off-the-shelf gguf. What do you need in your local inference stack to take advantage of M5's neural accelerators?

Replies

wren6991 • yesterday at 7:31 AM

Apple muddied the waters by calling them "neural accelerators", but it seems that what they actually added in the M5 generation is tensor instructions for the existing GPU cores. It's not a separate accelerator like the ANE.

llama.cpp's Metal backend does use them when they're available.

aurareturn • last Monday at 9:21 PM

They do work with llama.cpp and MLX automatically.
