Hacker News

littlestymaar 10/11/2024

Could you try running

    ./target/release/chat --model llama3.2-1b-it-q80.lmrs --show-metrics

to see how many tokens/s you get?

Replies

simonw 10/11/2024

Nice, just tried that with "tell me a long tall tale" as the prompt and got:

    Speed: 26.41 tok/s

Full output: https://gist.github.com/simonw/6f25fca5c664b84fdd4b72b091854...