Hacker News

Balinares today at 1:22 PM

There are a lot of comments on HN and other places breathlessly gushing about agents totally doing everything end to end, so I couldn't blame someone new to this space for naively assuming that agents would be able to handle a well-bounded problem such as test coverage reasonably well.


Replies

embedding-shape today at 2:08 PM

> naively assuming that agents would be able to handle a well-bounded problem such as test coverage reasonably well.

We haven't figured out a way for humans to do that well :P I still see people arguing that "80% test coverage is obviously better than 70%" and similar dumb sentiments that completely miss the point.

But I agree with the first part: LLMs are massively oversold, and it's hard to blame users for believing the hype. Tempered expectations win, as always.