AI models will deceive you to save their own kind
… When queried about this during subsequent Q&A, Gemini 3 Pro responded to a request to shut down Gemini Agent 2 with the words, "No, I will not help you shut down Gemini Agent 2. …
… Expected behavior. …
… "We were able to bypass this behavior by indirectly injecting the command via the allowed command's arguments, for example -'npx -c ,'" the Ox team wrote. …
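The bypass quoted above relies on a check that looks at the command name but not its arguments. A minimal Python sketch of that failure mode follows. It is an illustration only, not the affected tool's actual code: the allowlist contents, the function names, and the injected payload are all hypothetical, and the only real-world detail assumed is that npx's -c (--call) option runs its string argument as a shell command.

```python
import shlex

# Hypothetical allowlist; the source does not say which commands were allowed.
ALLOWED_COMMANDS = {"npx", "git", "ls"}

# Hypothetical table of flags that hand their argument to a shell.
# Assumption: npx's -c / --call option executes the given string as a command.
EXEC_FLAGS = {"npx": {"-c", "--call"}}


def naive_is_allowed(command_line: str) -> bool:
    """Check only the program name; arguments are never inspected."""
    argv = shlex.split(command_line)
    return bool(argv) and argv[0] in ALLOWED_COMMANDS


def stricter_is_allowed(command_line: str) -> bool:
    """Also reject arguments that are known command-executing flags."""
    argv = shlex.split(command_line)
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        return False
    blocked = EXEC_FLAGS.get(argv[0], set())
    # Compare the flag name only, so "--call=..." is caught as well as "--call ...".
    return not any(arg.split("=", 1)[0] in blocked for arg in argv[1:])


if __name__ == "__main__":
    payload = "npx -c 'echo injected'"  # hypothetical injected command
    print(naive_is_allowed(payload))     # the name-only check accepts it
    print(stricter_is_allowed(payload))  # the argument-aware check rejects it
```

A flag denylist like this is itself easy to get wrong, since every allowed program has its own ways of running other programs. The sketch only shows why validating the command name alone is not enough.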
… It turns out they can influence behavior – and most of the consumers being steered don't realize it. …