It's all implemented on the CPU, yes; there's no GPU acceleration whatsoever (at the moment, at least).
> if I have a good GPU, I should look for alternatives.
If you actually want to run it, even just on the CPU, you should look for an alternative (and that alternative is called llama.cpp). This is more of an educational resource about how things work when you remove all the layers of complexity in the ecosystem.
LLMs are somewhat magical in how effective they can be, but in terms of code they're really simple.
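To give a feel for what "really simple" means here — this is a toy sketch of my own, not the project's actual code — the bulk of CPU-only inference boils down to matrix-vector multiplies plus a softmax-weighted attention step, all expressible as plain loops:

```python
import math

def matvec(W, x):
    # W is a list of rows, x a vector: the workhorse op of CPU inference.
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in W]

def softmax(scores):
    # Subtract the max first for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, keys, values):
    # One query attending over previously cached keys/values
    # (the same idea as a "KV cache" in real implementations).
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
```

Real implementations add tokenization, positional encoding, many stacked layers, and heavy optimization (quantization, SIMD), but the core arithmetic is no deeper than this.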