Hacker News

simonw 10/11/2024

Nice, just tried that with "tell me a long tall tale" as the prompt and got:

    Speed: 26.41 tok/s
Full output: https://gist.github.com/simonw/6f25fca5c664b84fdd4b72b091854...

Replies

jodleif 10/12/2024

How fast is it with llama.cpp? A 1B model should be a lot faster on an M2.
