
verdverm · yesterday at 8:46 PM · 4 replies

I'd second this wholeheartedly

Since building a custom agent setup to replace Copilot, adopting/adjusting Claude Code prompts, and giving it basic tools, gemini-3-flash has been my go-to model unless I know it's a big, involved task. The model is really good at 1/10 the cost of Pro, super fast by comparison, and some basic A/B testing shows little to no difference in output on the majority of tasks I tried.

Cut all my subs, spend less money, don't get rate limited.
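For anyone curious what "giving it basic tools" could mean in practice, here is a minimal sketch of a tool-calling agent loop. This is purely illustrative and not the commenter's actual setup: `call_model` is a stub standing in for a real gemini-3-flash API call, and `read_file` is a hypothetical tool.

```python
import json

# Hypothetical "basic tools" the model is allowed to call.
TOOLS = {
    "read_file": lambda path: f"<contents of {path}>",  # stub tool
}

def call_model(messages):
    # Stub standing in for a real model API call: a real implementation
    # would send `messages` to the model and parse its structured reply.
    # Here we fake one tool call, then a final answer.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "read_file", "args": {"path": "main.py"}}
    return {"answer": "done"}

def run_agent(task):
    messages = [{"role": "user", "content": task}]
    for _ in range(10):  # cap iterations so a confused model can't loop forever
        reply = call_model(messages)
        if "answer" in reply:
            return reply["answer"]
        # Execute the requested tool and feed the result back to the model.
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": json.dumps(result)})

print(run_agent("summarize main.py"))
```

The loop cap is the important part: it bounds cost per task, which is what makes a cheap, fast model like a "flash"-tier one attractive for this pattern.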


Replies

dpoloncsak · yesterday at 8:56 PM

Yeah, on one of my first projects a buddy of mine asked, "Why aren't you using [ChatGPT 4.0] nano? It's 99% the effectiveness at 10% the price."

I've been using the smaller models ever since. Nano/mini, flash, etc.

r_lee · yesterday at 8:50 PM

Plus I've found that with "thinking" models, the extra tokens act more like working memory than an actual performance boost. It might even be worse: if the model goes even slightly wrong in the "thinking" part, it'll then commit to that mistake in the actual response.

PunchyHamster · today at 9:06 PM

The LLM bubble will burst the second investors figure out how much a well-managed local model can do.
