Hacker News

thomascgalvin · yesterday at 12:45 PM · 4 replies

Anecdotally, I haven't seen any real improvement from the AI tools I leverage. They're all good-ish at what they do, but all still lie occasionally, and all need babysitting.

I also wonder how much of the jump in early 2025 comes from cultural acceptance by devs, rather than an improvement in the tools themselves.


Replies

rustyhancock · yesterday at 12:57 PM

I think I'm coming to the same conclusion. GPT-3 to 5.3 have shown real, tangible but incremental improvements, with quite diminishing returns.

Perhaps we won't see a phase-change-like improvement, as we did from GPT-2 through to 3, until there are several more orders of magnitude of parameters and/or training. Perhaps we will never see it again!

What is getting rapidly better is the scaffolding, but this seems to be more about understanding and building tools around LLMs than about the LLMs themselves improving.

I'm still excited about AI, but not constantly hyped to the rafters like some.

egwor · yesterday at 12:47 PM

I think it depends on what you're using it for. If it's a simple Kubernetes config, then the model doesn't matter too much. Contrast that with writing the scenario for a backtest of an algo that trades on a venue: the complexity is not remotely the same, and the basic models are terrible at it. I've had models tell me they've added tests, only to find that they're just stubs! Opus seems to be getting there, but on more complex tasks the others are a complete waste of time.
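To make the "just stubs" complaint concrete, here is a minimal, hypothetical sketch of the failure mode: a generated test that runs and passes while exercising no real logic. All names here are invented for illustration, not taken from the commenter's codebase.

```python
import unittest

class TestBacktestScenario(unittest.TestCase):
    """A stub 'test' of the kind described above: it is collected and
    reported as passing, but contains no assertions at all."""

    def test_fill_price_respects_venue_tick_size(self):
        # TODO: implement
        pass  # passes vacuously -- nothing is ever checked
```

Running this suite reports one passing test, which is exactly what makes the pattern easy to miss in a quick review.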

jwpapi · yesterday at 1:44 PM

It’s better pre- and post-training, plus better harnessing.