Hacker News

flux3125, today at 5:48 PM

I imagine it would be advantageous to have something like llama.cpp encoded on a chip instead, so that we could run more than a single model. It would be slower than Jimmy, for sure, but depending on the speed, that could be an acceptable trade-off.