Hacker News

ChurchillsLlama, last Monday at 10:37 PM

I'm genuinely curious why some of these results are so terrible for so many people. I've built my own harness, and while I've noticed a degradation in quality, the local harness and the validation agents generally catch these issues. I've had to institute tighter controls and guardrails via hooks, but I don't see results that would warrant switching to a different provider.