Hacker News

milchek · today at 1:06 PM

I tested it briefly on a MacBook Pro M4 with 36 GB. Ran it in LM Studio with opencode as the frontend, and it failed over and over on tool calls. Switched back to Qwen. Anyone else on a similar setup have better luck?


Replies

internet101010 · today at 1:51 PM

I failed to run it in LM Studio on an M5 with 32 GB at even half the max context. It literally locked up the computer and I had to reboot.

Ran gemma-4-26B-A4B-it-GGUF:Q4_K_M just fine with llama.cpp, though. First time in a long time that I have been impressed by a local model. Both the speed (~38 t/s) and the quality are very nice.
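For reference, running a GGUF quant like the one above with llama.cpp can be as simple as a single command. This is a minimal sketch, not the commenter's exact invocation; the model filename and flag values here are assumptions for illustration:

```shell
# Sketch: run a local GGUF quant with llama.cpp's CLI.
# -m   path to a local GGUF file (filename here is hypothetical)
# -c   context window; keep modest on 32-36 GB machines to avoid lockups
# -ngl number of layers to offload to the GPU (Metal on Apple Silicon)
# -p   a short prompt to sanity-check generation speed (t/s is reported at exit)
llama-cli -m ./gemma-Q4_K_M.gguf -c 8192 -ngl 99 -p "Hello"
```

Lowering `-c` (context size) is the usual first lever when a model locks up a machine with limited unified memory, as described in the comment above.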

Aurornis · today at 2:47 PM

Tool calls failing is a problem with the inference engine's implementation and/or the quant. Update and try again in a few days.

This is how all open weight model launches go.

jasonjmcghee · today at 1:57 PM

Haven't had time to try yet, but heard from others that they needed to update both the main and runtime versions for things to work.
