verdverm • yesterday at 8:46 PM • 4 replies

I'd second this wholeheartedly

Since building a custom agent setup to replace Copilot, adopting/adjusting Claude Code prompts, and giving it basic tools, gemini-3-flash has been my go-to model unless I know it's a big and involved task. The model is really good at 1/10 the cost of pro, super fast by comparison, and some basic A/B testing shows little to no difference in output on the majority of tasks I used it for.

Cut all my subs, spend less money, don't get rate limited
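
A minimal sketch of the routing described above, not the commenter's actual setup: default to the cheap model and escalate only when a task is flagged as big and involved. The `Task` type, `pick_model`, `relative_spend`, and the name `gemini-3-pro` are assumptions for illustration; only the rough 1/10 cost ratio comes from the comment.

```python
from dataclasses import dataclass

# Model names are illustrative; "gemini-3-pro" is an assumed name for the "pro" tier.
CHEAP_MODEL = "gemini-3-flash"
BIG_MODEL = "gemini-3-pro"

# Relative cost per task, using the ~1/10 ratio quoted in the comment (not real pricing).
RELATIVE_COST = {CHEAP_MODEL: 1.0, BIG_MODEL: 10.0}


@dataclass
class Task:
    prompt: str
    big_and_involved: bool = False  # set by the user when they know the task is large


def pick_model(task: Task) -> str:
    """Use the cheap model unless the task is explicitly flagged as big."""
    return BIG_MODEL if task.big_and_involved else CHEAP_MODEL


def relative_spend(tasks: list[Task]) -> float:
    """Total relative cost when each task is routed with pick_model."""
    return sum(RELATIVE_COST[pick_model(t)] for t in tasks)
```

With nine small tasks and one flagged task, this routing spends 19 relative units, versus 100 if everything went to the larger model.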

Replies

dpoloncsak • yesterday at 8:56 PM

Yeah, on one of my first projects, one of my buddies asked, "Why aren't you using [ChatGPT 4.0] nano? It's 99% of the effectiveness at 10% of the price."

I've been using the smaller models ever since. Nano/mini, flash, etc.

r_lee • yesterday at 8:50 PM

Plus, I've found that with "thinking" models overall, the thinking is more for memory than an actual perf boost. It might even be worse, because if it goes even slightly wrong in the "thinking" part, it'll then commit to that in the actual response.

PunchyHamster • today at 9:06 PM

The LLM bubble will burst the second investors figure out how much a well-managed local model can do.

dingnuts • yesterday at 8:52 PM

[dead]
