I ended up overengineering a LangGraph workflow to handle this. It forces the LLM to generate its own tests and pass them in a sandbox before I even see the PR. The API costs are significantly higher because of the retry loops, but it filters out the low-effort attempts.