Paper page - The Cold-Start Safety Gap in LLM Agents
… The following papers were recommended by the Semantic Scholar API:
SABER: Benchmarking Operational Safety of LLM Coding Agents in Stateful Project Workspaces (2026)
VESTA: A Fully Automated Scenario Generation and Safety Evaluation Framework for LLM Agents (2026)
Plant, Persist, Trigger: Sleeper Attack … …