Hacker News

10xDev today at 6:54 PM (5 replies)

If AI can program, why does it matter if it can play Chess using CoT when it can program a Chess Engine instead? This applies to other domains as well.


Replies

RivieraKid today at 8:54 PM

It can write a chess engine because it has read the code of a thousand chess engines. This benchmark measures a different aspect of intelligence.

And as a poker player, I can say that poker is much more challenging for computers than chess. Writing a program that can play poker really well and efficiently is an unsolved problem.

NitpickLawyer today at 8:11 PM

> If AI can program, why does it matter if it can play Chess using CoT when it can program a Chess Engine instead?

Heh, we really did come full circle on this! When ChatGPT launched in Dec '22, one of the first things people noticed was that it sucked at math. Like, basic math such as 12 + 35 would trip it up. Then people "discovered" tool use and added a calculator. And everyone was like "well, that's cheating, of course it can use a calculator, but look, it can't do the simple addition logic"... And now here we are :)

CooCooCaCha today at 8:56 PM

CoT is upstream of building a chess engine.

Chess engines don’t grow on trees, they’re built by intelligent systems that can think, namely human brains.

Supposedly we want to build machines that can also think, not just regurgitate things created by human brains. That’s why testing CoT is important.

It’s not actually about chess, it’s about thinking and intelligence.

simianwords today at 7:25 PM

It's the same reason we are asked to write exams without calculators even though the real world does have them.

How you work without calculators is a proxy for real-world competency.

Davidzheng today at 7:08 PM

They should be allowed to! In fact, I think a better benchmark would be to invent new games and test the models' ability to allocate compute to minimax/AlphaZero-style search on those new games under compute constraints.
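
To make the "search under a compute budget" idea concrete, here is a minimal sketch of minimax with a hard cap on expanded nodes. Everything in it is an assumption made for illustration: the toy game (a pile of stones, take 1 to 3 per turn, whoever takes the last stone wins), the names `Budget`, `minimax`, and `best_move`, and the choice to score unexplored positions as 0. It is plain minimax, not AlphaZero, and not a description of any existing benchmark.

```python
from dataclasses import dataclass

MOVES = (1, 2, 3)  # hypothetical rule: take 1 to 3 stones per turn


@dataclass
class Budget:
    nodes: int  # remaining number of positions we are allowed to expand


def minimax(pile: int, maximizing: bool, budget: Budget) -> int:
    """Score from the maximizing player's view: +1 win, -1 loss, 0 unknown."""
    if pile == 0:
        # The player who just moved took the last stone and won.
        return -1 if maximizing else 1
    if budget.nodes <= 0:
        return 0  # out of compute: fall back to a neutral estimate
    budget.nodes -= 1
    scores = [
        minimax(pile - m, not maximizing, budget) for m in MOVES if m <= pile
    ]
    return max(scores) if maximizing else min(scores)


def best_move(pile: int, node_budget: int) -> int:
    """Pick the move with the best minimax score within a shared node budget."""
    budget = Budget(node_budget)
    legal = [m for m in MOVES if m <= pile]
    # After our move it is the opponent's (minimizing) turn.
    return max(legal, key=lambda m: minimax(pile - m, False, budget))
```

With a generous budget, `best_move(5, 10_000)` leaves the opponent on a pile of 4, which is a losing position in this toy game. With a budget of 0 the search only sees immediate wins, so the interesting question for a benchmark would be how well a model spends a budget somewhere in between.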