Simulations are only as good as the representations of reality they are based on. If they keep using tactical nukes, they've been fed weak data. Do the war games include the broader economic and political environments that military successes are won on? WWI was settled by a naval blockade.
Agreed. But I'm not sure which decision maker is more myopic toward the big picture and long-lasting implications of a decision: an LLM, or the top brass at the Department of War.
People like to talk tough online. They tend to change their rhetoric in person. Our "training data" is problematic by design.
I suspect it's more that the text data doesn't exist. They're trained on text that was recorded. How often has a decision not to use a nuke been publicly recorded, with any context around that decision?
From the text perspective, it's something that has to be inferred indirectly. If you went through all relevant training data and appended ", we decided not to use a nuke", I suspect the results would be improved.