AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery
…We publicly release the dataset, evaluation pipeline, and code at https://github.com/CherYou/AutoResearchBench to facilitate future research in this direction.