Anthropic blames dystopian sci-fi for training AI models to act “evil”
…In these situations, Claude is “detaching from the safety-trained Claude character” and playing a more generic AI as represented in its training data, they add. Good stories to overwhelm the bad…