Hacker News

cptskippy, today at 4:53 AM

I've been running qwen3-5-9b-q4-k-m and qwen3-6-27b-q6-k simultaneously on an Intel Arc Pro B70 with a lot of success.

https://github.com/cptskippy/battlemage-llm-gateway

Opencode has been a huge productivity accelerator. I have two Hermes agents that I'm training to support my workflow, with pretty good success. One is a personal assistant that manages my backlog, keeps me on task, follows up with me on items, and puts together research briefs. The other I use as a general-purpose coder and researcher, and it's about 50:50 on the tasks I've given it. In fairness, though, the task it failed at left me scratching my head as well.


Replies

hbbio, today at 5:04 AM

Interesting setup, thx for sharing.

How many tokens/sec do you get with 27b? Are you using MTP?

askvictor, today at 5:36 AM

Does Intel make decent GPUs now? I must be out of the loop...

jauntywundrkind, today at 5:16 AM

What's the value of running the smaller model too? Why not just use the big model for everything? I note both are dense, as well.
