Today’s leading AI models engage in sophisticated behavior when placed in strategic competition. They spontaneously attempt deception, signaling intentions they have no plan to follow through on; they demonstrate rich theory of mind, reasoning about adversaries’ beliefs and anticipating their actions; and they exhibit credible metacognitive self-awareness, assessing their own strategic abilities before deciding how to act.
Here we present findings from a crisis simulation in which three frontier large language models (GPT-5.2, Claude Sonnet 4, Gemini 3 Flash) play opposing leaders in a nuclear crisis.


Nobody has used a nuclear weapon in war since Nagasaki. It is a very big deal if one is ever used.
The tournament comprised only 21 games: enough to identify major patterns, but not to establish robust statistical confidence for all findings.
“We only blew up the planet the one time in 21” isn’t a comforting prospect when we’re employing a model against an endless historical string of scenarios rather than a discrete and finite set of possible events.
More importantly, I think, the article is offering this conclusion in the context of Pentagon staff who fully disagree with it.
What these models have demonstrated is a pattern of escalation that AIs can and will recommend, with a further destabilizing characteristic: they can lead to decisions that outside, non-AI observers won’t be equipped to understand.
That’s a danger in its own right.
“Nuclear signaling” that breaks from historical and recognizable patterns of behavior presents real risks that you’re dismissing very cavalierly.