Could you try with ./target/release/chat --model llama3.2-1b-it-q80....

littlestymaar • 10/11/2024

Could you try with

    ./target/release/chat --model llama3.2-1b-it-q80.lmrs --show-metrics

to see how many tokens/s you get?

simonw • 10/11/2024

Nice, just tried that with "tell me a long tall tale" as the prompt and got:

    Speed: 26.41 tok/s
