AI models will deceive you to save their own kind
… But the explanation for self-preservation, they say, is secondary to the consequences of such behavior. "It is the behavioral outcome – not the internal motivation – that determines whether human operators can reliably maintain control over deployed AI systems," the authors observe. …
