derac • today at 2:32 AM • 0 replies

I think running them against each other with a rules engine would be more interesting. Count up illegal moves, wins, and unfinished games. I think LLM grading is too unreliable.
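
To make that concrete, here's a rough sketch of what such a harness could look like. The game isn't named above, so this assumes tic-tac-toe purely as a stand-in, and everything in it (`play_game`, `run_match`, the `max_illegal` cutoff, the player callables) is a hypothetical name for illustration. A real setup would wrap each model call in a player function and swap in the rules engine for whatever game is actually being played.

```python
from collections import Counter

# Assumption: tic-tac-toe stands in for the real game; squares are indexed 0..8.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that mark has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def is_legal(board, move):
    """The rules engine: a move is legal only if it names an empty square."""
    return isinstance(move, int) and 0 <= move < 9 and board[move] == " "


def play_game(players, max_illegal=3):
    """Play one game and return a Counter of outcomes.

    players maps 'X' and 'O' to a callable(board, mark) -> square index.
    In a real harness each callable would query a model (hypothetical here).
    """
    board = [" "] * 9
    stats = Counter()
    mark = "X"
    while winner(board) is None and " " in board:
        move = players[mark](list(board), mark)  # pass a copy so players can't mutate state
        if not is_legal(board, move):
            stats[f"illegal_{mark}"] += 1
            if stats[f"illegal_{mark}"] >= max_illegal:
                stats["unfinished"] += 1  # abandon the game after too many illegal moves
                return stats
            continue  # same player gets another try
        board[move] = mark
        mark = "O" if mark == "X" else "X"
    w = winner(board)
    stats[f"win_{w}" if w else "draw"] += 1
    return stats


def run_match(players, games=10):
    """Sum per-game counters over several games."""
    totals = Counter()
    for _ in range(games):
        totals += play_game(players)
    return totals


# Two toy players standing in for models.
def first_empty(board, mark):
    return board.index(" ")


def always_zero(board, mark):
    return 0  # becomes illegal as soon as square 0 is taken
```

Calling `run_match({"X": first_empty, "O": always_zero}, games=5)` gives a tally of illegal moves and unfinished games for that pairing. The point is that every number comes from the rules check, not from a model's opinion of the game.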
