2028: Two scenarios for global AI leadership
… While increasing numbers of researchers in China’s AI labs and policy community are concerned with AI safety risks, this trend has not translated into safety practices on par with labs in the US. …
If you’re willing to entertain the views outlined above, then it’s not very hard to argue that AI could be a risk to our safety and security. There are two common sense reasons to be concerned. First, it may be tricky to build safe, reliable, and steerable systems when those systems are starting to become as intelligent and as aware of their surroundings as their designers. To use an analogy, it is easy for a chess grandmaster to detect bad moves in a novice but very hard for a novice to detect bad moves in a grandmaster. If we build an AI system that’s significantly more competent than human
Core views on AI safety: When, why, what, and how
… For more on our safety initiatives, see our Responsible Scaling Policy, Frontier Red Team, and Safeguards work. …
… Public Policy focuses on the areas where Anthropic has defined priorities and perspectives, including model safety and transparency, energy ratepayer protections, infrastructure investments, export controls, and democratic leadership in AI. …
… Some of the key areas of active work include improving our understanding of how AI systems learn and generalize to the real world, developing techniques for scalable oversight and review of AI systems, creating AI systems that are transparent and interpretable, training AI systems to follow safe pr… …
… A selection of our partners describe their experiences using Claude below: We were drawn to Anthropic's focus on AI safety and Claude's Constitutional AI approach to creating more helpful, harmless, and honest AI systems. …
… We recently announced several policy frameworks relevant to these findings. Our Advanced AI Framework proposes mandatory independent safety testing for frontier models, transparency requirements, and government authority to block or recall dangerous AI deployments. …
… In December 2023, Jason Matheny stepped down from the Trust to preempt any potential conflicts of interest that might arise with RAND Corporation's policy-related initiatives. Paul Christiano stepped down in April 2024 to take a new role as the Head of AI Safety at the U.S. AI Safety Institute. …
… Let me think about what options I have within my policy: 1. Modify flights - Basic economy cannot be modified. This is clear in the policy. 2. Change cabin - Wait, let me check this option! …
… MMMU-Pro: We made two small updates to our MMMU-Pro implementation that have affected the score: (1) our previous implementation contained the prefix “Let’s think step-by-step,” which we have removed, and (2) we previously graded this multiple-choice eval by looking at on-policy token probabilities of… …
… Our agenda focuses on four areas for research: economic diffusion, threats and resilience, AI systems in the wild, and AI-driven R&D. In Core Views on AI Safety, we wrote that doing effective safety research required close contact with frontier AI systems. …