AFFORDANCE20Q: Evaluating Affordance Reasoning from Physical Properties
…Yifan Jiang

Abstract: The Affordance20Q benchmark challenges LLMs to infer objects' action possibilities through a 20-Questions game format without revealing object identities, exposing significant performance gaps relative to humans and identifying key…