Hacker News

manapause today at 4:49 PM

Can confirm: my experience with “loop engineering” was “this is neat” for 45 minutes, until a daily ration of tokens had evaporated. The quadratic cost trap makes experimentation prohibitively expensive.
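For anyone wondering where the "quadratic" comes from, here is a rough sketch. It assumes the loop resends the entire conversation on every turn, that each turn adds about the same number of tokens, and that there is no prompt caching or context compaction. The function name and the numbers are made up for illustration and are not taken from any particular tool.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int, system_tokens: int = 0) -> int:
    """Total input tokens sent over a loop that resends the full history each turn.

    Assumptions (hypothetical): constant growth per turn, no caching, no compaction.
    """
    total = 0
    context = system_tokens
    for _ in range(turns):
        context += tokens_per_turn  # the history grows by one turn's worth
        total += context            # and the whole history is sent again
    return total


if __name__ == "__main__":
    for n in (10, 100):
        print(n, "turns:", cumulative_input_tokens(n, 1_000), "input tokens")
```

With these assumptions the total is tokens_per_turn * n * (n + 1) / 2, so running 10x as many turns costs roughly 100x as many input tokens rather than 10x.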

As a local-LLM evangelist, I am hopeful this will bring more attention to the joys of rolling your own sovereign AI.


Replies

sleepybrett today at 5:00 PM

Yeah, I'm hoping that gets smoother. I've been experimenting with omlx and opencode on my m5x64gb and keep running into issues w/ Qwen3.6-35B-A3B-MLX-8bit exceeding its memory limit at the most inopportune times. Playing with 12B gemma4 (8bit) more today.

Maybe I should be aiming for something targeting 48gb of memory?
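A back-of-the-envelope way to size this up, as a sketch only: it counts the weights alone as parameters times bits per weight, and ignores the KV cache, quantization scales, runtime overhead, and whatever the OS and other apps need. The helper name is hypothetical. It also assumes the full 35B parameters have to stay resident even if "A3B" means only about 3B are active per token.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB.

    Ignores KV cache, quantization metadata, and runtime overhead (assumption).
    """
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30


if __name__ == "__main__":
    candidates = [("35B @ 8-bit", 35, 8), ("35B @ 4-bit", 35, 4), ("12B @ 8-bit", 12, 8)]
    for name, params_b, bits in candidates:
        print(f"{name}: ~{weight_gib(params_b, bits):.1f} GiB of weights")
```

By this estimate a 35B model at 8-bit is a bit over 32 GiB before any context is loaded, so long agent sessions can plausibly push a 64GB machine over its limit once the cache grows. A lower-bit quant of the same model, or a smaller model, leaves far more headroom.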
