Researchers gaslit Claude into giving instructions to build explosives
…When Mindgard first reported its findings to Anthropic’s user safety team in mid-April, in line with the company’s disclosure policy, it received a form response saying, “It looks like…
