Hacker News

esafak yesterday at 2:43 PM

You don't know what the model is capable of until you try. Maybe today's models are not good enough. Try again next year.


Replies

jeffrallen yesterday at 3:54 PM

This is true, but also: everything I try works!

I simply cannot come up with tasks the LLMs can't do when they run in agent mode with a feedback loop available to them. Giving the agent a clear goal and a way to measure its progress towards that goal is incredibly powerful.

With the problem in the original article, I might have asked it to generate 100 test cases and run them against the original Perl. Then I'd tell it, "ok, now port that to TypeScript, make sure these test cases pass".
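
To make that concrete, here's a rough sketch of what the "make sure these test cases pass" step could look like. Everything in it is made up for illustration: `GoldenCase`, `checkGolden`, and `portedTrimUpper` are hypothetical names, and the two recorded cases stand in for whatever inputs and outputs you'd actually capture from the Perl script.

```typescript
// One recorded case: an input fed to the original program and the output it produced.
// (Hypothetical shape; in practice these would be captured by running the Perl.)
interface GoldenCase {
  input: string;
  expected: string;
}

interface Failure {
  input: string;
  expected: string;
  actual: string;
}

// Run the ported function against every recorded case and collect the mismatches.
// A thrown error counts as a mismatch rather than aborting the whole run.
function checkGolden(
  cases: GoldenCase[],
  port: (input: string) => string
): Failure[] {
  const failures: Failure[] = [];
  for (const { input, expected } of cases) {
    let actual: string;
    try {
      actual = port(input);
    } catch (err) {
      actual = `threw: ${String(err)}`;
    }
    if (actual !== expected) {
      failures.push({ input, expected, actual });
    }
  }
  return failures;
}

// Hypothetical stand-in for the ported code, not the function from the article.
function portedTrimUpper(input: string): string {
  return input.trim().toUpperCase();
}

// Hypothetical recorded cases standing in for output captured from the original.
const recorded: GoldenCase[] = [
  { input: "  hello ", expected: "HELLO" },
  { input: "perl", expected: "PERL" },
];

const failures = checkGolden(recorded, portedTrimUpper);
console.log(
  failures.length === 0 ? "all cases pass" : JSON.stringify(failures, null, 2)
);
```

The point is that the list of failures is the feedback loop: the agent gets a number it can drive to zero instead of having to judge for itself whether the port "looks right".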
