irusensei • today at 8:12 AM • 2 replies

>Oh does llama.cpp use MLX or whatever?

No. It runs on macOS, but it uses Metal rather than MLX.

ANE-powered inference (at least for prefill, which is a key bottleneck on pre-M5 platforms) is also in the works, per https://github.com/ggml-org/llama.cpp/issues/10453#issuecomm...

OkGoDoIt • today at 8:58 AM

Is that better or worse?
