I played textual chess with Fable. It took around 15 moves before it made a large blunder. I asked it to give its reasoning per move and it mistakenly assumed a piece was protected when it wasn’t and after the blunder it realized its fault and did not suggest an illegal move. Other LLMs lost game state far earlier. But a good human chess player can keep the game state in his mind much longer, so this random eval shows a big improvement over old AI models