Hacker News

somat, today at 3:06 AM

"AMD’s AI director reports that Claude Code has become “dumber and lazier” since February, based on analysis of 6,852 sessions and 234,760 tool calls, which is the most thorough performance review any AI has received and rather more than most human employees get."

Are there any good ways to measure agent ability? Or do we just have to go by vibes?