al_borland • today at 3:19 AM

It gets it wrong 100% of the time. A script to validate would send it into an infinite loop of generating code and failing validation.

Replies

simonw • today at 3:24 AM

Are you sure about that?

I don't think I've ever seen Opus 4.5 or GPT-5.2 get stuck in a loop like that. They're both very good at spotting when something doesn't work and trying something else instead.

Might be a problem with older, weaker models I guess.
