AI agents create new risks requiring continuous monitoring and oversight
… This includes testing before models are deployed, as well as continuous monitoring that can track changes in their behavior when they encounter real-world scenarios. Even an AI agent that has been robustly tested pre-deployment can behave in unplanned ways once it is live. …