Sonnet, GPT-5.2, Gemini Flash, in a set of 21 games, where conclusions are drawn from the LLMs' self-reported reasoning.
This is like writing a paper about kids in a literal sandbox fighting over ‘territory’.
The models employed don’t indicate the actual extent of machine reasoning, even as we currently recognize it. They certainly don’t have the metacognition necessary to accurately understand their own reasoning. As we’ve seen with recent papers on how LLMs do math, there’s a complete disconnect between the actual and the reported mechanism.
“Chilling” shouldn’t be the takeaway here.
> “Chilling” shouldn’t be the takeaway here.
It is when you consider the personality currently occupying the office of US SecDef.
LLMs have already been used to bomb schoolgirls; chilling is absolutely the operative word to use here, especially since these delusional fools want to incorporate LLMs into everything.
So in the context you just laid out, you can apply that to this. "Artificial Intelligence Strategy for the Department of War" https://media.defense.gov/2026/Jan/12/2003855671/-1/-1/0/art...
Regardless of what the capabilities of the models are, they will be used in every situation possible.