Hacker News

fc417fc802 today at 11:14 AM

Rephrase that in terms of the human mechanic and hopefully you can see the error of that reasoning. LLMs that perform tasks (as opposed to merely holding conversations) use tools just like we do. That's literally how we design them to operate.

In fact, the LLMs that everyone uses today typically have access to specialized, task-specific tooling. Obviously, specialized tools aren't appropriate for a test that measures the ability to generalize, but generic tools are par for the course. Writing a bot to play a game for you would certainly serve to demonstrate an understanding of the task.


Replies

UltraSane today at 5:24 PM

I'm pretty sure the LLM can use tools while doing ARC-AGI-3, but it has to have the same tools available all the time, not an incredibly elaborate custom harness.
