Hacker News

frwickst today at 12:16 PM

I'm getting 6.55 tokens/s with the Qwen3.5-397B-A17B-4bit model, using the command: ./infer --prompt "Explain quantum computing" --tokens 100

MacBook Pro M5 Pro (64GB RAM)


Replies

j45 today at 1:19 PM

Appreciate the data point. M5 Max would also be interesting to see once available in desktop form.

logicallee today at 12:45 PM

Can you post the final result (or as far as you got before you killed it) to show us how cohesive and good it is? I'd like to see an example of the output.
