A “diff” tool for AI: Finding behavioral differences in new models
Mar 13, 2026

Every time a new AI model is released, its developers run a suite of evaluations to measure its performance and…