Paper page - QUACK: Questioning, Understanding, and Auditing Communicated Knowledge in Multimodal Social Deduction Agents
…Social deduction games have become a popular testbed for evaluating reasoning, deception, coordination, and belief modeling in large language models. However, most environments evaluate agents primarily through game outcomes like win rates…