Boosting Omni-Modal Language Models: Staged Post-Training with Visually Debiased Evaluation
…https://cheliu-computation.github.io/omni/
Omni-modal language models are intended to jointly understand audio, visual inputs, and language, but…