On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment
…AI-generated summary: Tool-using LLM agents can fail anywhere along a trajectory, not only in the final response. They may execute unsafe tool calls, follow injected instructions, comply with harmful requests, or over-refuse…