Showing top 105 results for "Safety for agents"

Introducing Claude Opus 4.5

…Claude Opus 4.5 represents a breakthrough in self-improving AI agents. For automation of office tasks, our agents were able to autonomously refine their own capabilities, achieving peak performance in 4…

Nov 24, 2025

Anthropic blames dystopian sci-fi for training AI models to act “evil”

…The problem, the researchers theorize, is that this kind of RLHF safety training couldn’t possibly cover every single type of ethically difficult situation an agentic AI might encounter. When a modern…

May 13, 2026 · Kyle Orland

How to Minimize Game Runtime Inference Costs with Coding Agents | NVIDIA Technical Blog

NVIDIA ACE is a suite of technologies for building AI agents for gaming. ACE provides ready-to-integrate cloud and on-device AI models for every part of in-game characters, from…

Mar 3, 2026 · Brandon Rowlett

Paper page - Audio-Visual Intelligence in Large Foundation Models

…AVI systems, etc. 🔹 Interaction Dialogue systems, embodied agents, conversational AVI, agentic multimodal systems, and interactive world modeling. ✨ Highlights of this survey: 📚 A unified taxonomy for AVI tasks and paradigms 🧠 Foundations of…

May 8, 2026

Discussions and forums

Hacker News · u/mosiddi · Jan 30, 2026

Show HN: Agent OS – Safety-first platform for building AI agents with VS Code

Hi HN, I built Agent OS because I was tired of the "orchestration tax" – writing the same safety checks, memory management, and tool-handling code in every AI agent project. What it does: - Visual policy edit…

Hacker News · u/lucarizzo1010 · 1w ago

Show HN: AgentShield – Stop AI agents from spending money unsupervised

I'm a recent grad from UMich and built AgentShield because agentic AI is moving fast but payment safety hasn't caught up. Agents are already being handed API keys, stablecoin wallets, and payment credentials - if one mis…

Hacker News · u/podlp · Apr 28, 2026

Show HN: iClaw is part OpenClaw, part Siri, powered by Apple Intelligence

Hi HN, last month at a SundAI hackathon, my team built a prototype for an app called iClaw. The goal was to develop an AI agent using Apple Intelligence. I've since continued hacking away at this idea when I had time, and…

Hacker News · u/deepakakkil · 2w ago

Show HN: Emergence World: World building as a way to evaluate LLMs

Current LLM benchmarks are broken. We think long horizon "world" building could be an interesting additional way to evaluate LLMs, since it combines many aspects such as need for advanced reasoning, tool calling, working…

Co-Scientist: A multi-agent AI partner to accelerate research

…Proximity agent - Maps and clusters generated hypotheses to help ensure a diverse, comprehensive exploration of the research space. Debate ideas: Reflection agent - Acts as a "virtual peer reviewer," critically evaluating hypotheses for…

May 19, 2026 · Co-Scientist team

Paper page - Stable-GFlowNet: Toward Diverse and Robust LLM Red-Teaming via Contrastive Trajectory Balance

…AI-generated summary Large Language Model (LLM) Red-Teaming, which proactively identifies vulnerabilities of LLMs, is an essential process for ensuring safety. Finding effective and diverse attacks in red-teaming is important…

May 4, 2026

Building Telco Reasoning Models for Autonomous Networks with NVIDIA NeMo | NVIDIA Technical Blog

…accuracy and safety Initial evaluation focuses on incident summary accuracy: how well the model, embedded in a ReAct‑style agent with tools, predicts and executes the correct resolution path for a given…

Mar 1, 2026 · Aiden Chang

Mastering Agentic Techniques: AI Agent Customization | NVIDIA Technical Blog

…This post explains nine techniques for customizing AI agents, along with criteria for selecting the right techniques for your use case. To learn about evaluating AI agents, see Mastering Agentic Techniques: AI…

May 20, 2026 · Edward Li

Netflix, Meta, IBM speakers discuss AI and their workdays

…Once he sets off one agent to implement some new feature, he tasks another agent to do the preliminary work for the next task he has in mind. In effect, he is…

Apr 4, 2026 · Joab Jackson

Paper page - Code World Model Preparedness Report

…Prompt Engineering for Code Generation (2026) A Systematic Approach for Large Language Models Debugging (2026) Who Tests the Testers? Systematic Enumeration and Coverage Audit of LLM Agent Tool Call Safety (2026) Emergent…

May 6, 2026