TVIR: Building Deep Research Agents Towards Text-Visual Interleaved Report Generation
…Xinkai Ma, …, Qianqian Xie, …, Minghao Liu, …
Abstract: A multimodal deep research benchmark and agent framework are introduced to evaluate and improve the factual reliability and visual alignment of automated report generation systems…