Hacker News

tottenhm, yesterday at 9:23 PM

> In 56% of eval cases, the skill was never invoked. The agent had access to the documentation but didn't use it.

The agent passes the Turing test...