Hacker News

harshhhhhhhhh today at 12:15 PM

Seems promising, this is the way. Can someone benchmark this?


Replies

frwickst today at 12:16 PM

I'm getting 6.55 t/s using the Qwen3.5-397B-A17B-4bit model with the command: ./infer --prompt "Explain quantum computing" --tokens 100

MacBook Pro M5 Pro (64GB RAM)
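
For anyone who wants to cross-check a figure like this, here is a rough sketch of the arithmetic: throughput is just generated tokens divided by elapsed seconds, so 100 tokens at 6.55 t/s works out to roughly 15 seconds of generation. The helper names below (tokens_per_second, time_command) are made up for illustration, and how ./infer measures its own t/s is not stated in the thread. Timing the whole process from outside also counts model load time, so the result may not match a number the tool reports itself.

```python
import shlex
import subprocess
import time


def tokens_per_second(num_tokens: int, elapsed_seconds: float) -> float:
    """Throughput = generated tokens / elapsed seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


def time_command(command: str) -> float:
    """Run a command line and return its wall-clock duration in seconds.

    Hypothetical helper: this measures the whole process, including any
    startup and model loading, not just token generation.
    """
    start = time.perf_counter()
    subprocess.run(shlex.split(command), check=True)
    return time.perf_counter() - start


# Example usage (assumes an ./infer binary in the current directory,
# invoked exactly as in the comment above):
#
#   elapsed = time_command('./infer --prompt "Explain quantum computing" --tokens 100')
#   print(f"{tokens_per_second(100, elapsed):.2f} t/s")
```

Running it a few times and discarding the first run is a common way to reduce the effect of cold caches, though that is a general habit and not something the thread specifies.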
