Hacker News

pbronez yesterday at 11:55 AM

Are you actually using Exo for local clustered AI inference? I’ve considered it a few times and keep finding horror stories. Never seen someone report it’s actually working well for them.


Replies

znnajdla yesterday at 12:04 PM

No, not yet. Planning to. But Qwen3 Coder Next 4-bit runs decently well with LM Studio on my M3 Max with 96 GB RAM (50 tok/s at low context).