Could you try with:
./target/release/chat --model llama3.2-1b-it-q80.lmrs --show-metrics
Nice, just tried that with "tell me a long tall tale" as the prompt and got:
Speed: 26.41 tok/s
Full output: https://gist.github.com/simonw/6f25fca5c664b84fdd4b72b091854...