Paper page - LongDS-Bench: On the Failure of Long-Horizon Agentic Data Analysis
…Further analysis shows that additional agent steps do not necessarily improve performance, suggesting that the key bottleneck is maintaining a correct analytical state rather than increasing interaction budget. We release LongDS to…