Hacker News

cyanydeez · yesterday at 6:15 PM

Sounds like you're a candidate for a local model. It's kinda nice not caring what the token count means except as to compaction.


Replies

brushfoot · yesterday at 6:39 PM

Not paying per token? Not sending my code to someone else's servers for inference? That's the stuff of sweet dreams for a stingy, paranoid solopreneur like me.

If I could run a local model comparable to even Sonnet 4.6 without shelling out $50K in hardware, I'd do it in a heartbeat. But all I have is 32 GB of RAM and an old RTX 4080.

Or am I not up to speed? Are there decent coding models that can run on dev laptops? Not that that's necessarily what you were suggesting by recommending a local model; just curious.
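For anyone wondering the same thing, a rough sketch of the "can this fit on my GPU" arithmetic, assuming the common rule of thumb that quantized weights take params × bits / 8 bytes plus some overhead for the KV cache and runtime buffers (the overhead figure and the 16 GB desktop RTX 4080 VRAM are assumptions for illustration, not benchmarks):

```python
# Back-of-envelope VRAM estimate for running a quantized LLM locally.
# Assumption: weights take params * bits_per_weight / 8 bytes, plus
# roughly 15% overhead for KV cache and buffers at modest context sizes.

def vram_gb(params_billion: float, bits_per_weight: float,
            overhead: float = 0.15) -> float:
    """Approximate memory footprint in GB: weights plus overhead."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead) / 1e9

# By this estimate, a 32B model at 4-bit quantization needs ~18 GB,
# so it overflows a 16 GB desktop RTX 4080 unless some layers are
# offloaded to system RAM (which tools like llama.cpp support).
for params, bits in [(7, 4), (14, 4), (32, 4), (32, 8), (70, 4)]:
    print(f"{params}B @ {bits}-bit ~ {vram_gb(params, bits):.1f} GB")
```

The takeaway: 7B–14B coding models at 4-bit fit comfortably on a 12–16 GB GPU, while 30B-class models need partial CPU offload on that hardware, trading speed for capability.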

kanemcgrath · yesterday at 6:24 PM

I do love using local models when I can, but qwen-35B is the best model I can run, and while it's an insanely good local model, it does not compare to the big ones.
