SoundnessBench: Can Your AI Scientist Really Tell Good Research Ideas from Bad Ones?
Autonomous AI research agents aim to accelerate scientific discovery by automating the research pipeline, from hypothesis generation to peer review. However, existing benchmarks rarely…