Hacker News

10xDev, yesterday at 7:35 PM

It is a program. I need it to get task X done and I don't care how, whether it is strictly through CoT or with tools. There is no such thing as cheating in real work and no reason to handicap it. Just test the limits of what it can do with whatever means possible.

Trying to solve everything with CoT alone without utilising tools seems futile.


Replies

simianwords, yesterday at 7:51 PM

You are not understanding. It's a proxy for how well it does other things.
