Hacker News

Vegenoid, today at 3:18 AM

This would require LLMs to be good at knowing when they are doing a bad job, which they are still terrible at. With a good testing and verification harness set up, sure: it could just escalate to a more powerful model when it can't make the tests pass. But not a lot of usage looks like that.
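
To make the harness case concrete, here is a rough sketch of what "go to a more powerful model if it can't make tests pass" might look like. Everything in it is an assumption for illustration: `solve_with_escalation`, `small_model`, `large_model`, and `passes_tests` are hypothetical names, and the "models" are stubs that return fixed strings rather than calls to any real LLM API. The point is that the decision to escalate comes from the external test result, not from the model judging its own work.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical types: a "model" maps a task prompt to a candidate solution,
# and a "verifier" runs the test harness and reports pass/fail.
Model = Callable[[str], str]
Verifier = Callable[[str], bool]


def solve_with_escalation(
    task: str,
    models: Sequence[Tuple[str, Model]],
    passes_tests: Verifier,
) -> Optional[Tuple[str, str]]:
    """Try models in the given order (cheapest first). Return (name, solution)
    for the first candidate that passes the tests, or None if none pass."""
    for name, model in models:
        candidate = model(task)
        if passes_tests(candidate):
            return name, candidate
    return None


# Stub models standing in for a cheap and an expensive LLM (assumption).
def small_model(task: str) -> str:
    return "def add(a, b): return a - b"  # deliberately wrong


def large_model(task: str) -> str:
    return "def add(a, b): return a + b"


# Stub harness: executes the candidate and checks a single test case.
def passes_tests(source: str) -> bool:
    namespace: dict = {}
    exec(source, namespace)
    return namespace["add"](2, 3) == 5


result = solve_with_escalation(
    "write add(a, b)",
    [("small", small_model), ("large", large_model)],
    passes_tests,
)
print(result[0] if result else "no model passed")  # prints "large"
```

Without something playing the role of `passes_tests`, there is no signal to escalate on, which is the problem with most usage.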