UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors
…UniVidX formulates pixel-aligned tasks as conditional generation in a shared multimodal space, adapts to modality-specific distributions while preserving the backbone's native priors, and promotes cross-modal consistency during synthesis…
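One way to read "pixel-aligned tasks as conditional generation in a shared multimodal space" is sketched below. This is an illustration only, not the authors' implementation: the function name `build_conditional_input`, the modality names `rgb` and `depth`, the flat-list representation, and the linear noise blend are all assumptions made for the example. The idea shown is that every modality lives in one shared container, the modalities chosen as conditions stay clean, and the remaining ones are noised so that a generative model would have to synthesize them.

```python
import random


def build_conditional_input(modalities, condition_names, noise_level, seed=0):
    """Keep condition modalities clean and noise the target modalities.

    modalities      -- dict mapping a (hypothetical) modality name to a flat
                       list of pixel values; all lists must have equal length
    condition_names -- set of names treated as given conditions
    noise_level     -- float in [0, 1]; 0 keeps targets clean, 1 is pure noise
    """
    rng = random.Random(seed)

    # "Pixel-aligned" is modeled here as: every modality has the same size.
    lengths = {len(values) for values in modalities.values()}
    if len(lengths) != 1:
        raise ValueError("modalities must be pixel-aligned (same size)")

    model_input = {}
    for name, values in modalities.items():
        if name in condition_names:
            # Conditions are passed through unchanged.
            model_input[name] = list(values)
        else:
            # Targets are blended with Gaussian noise (an assumed schedule).
            model_input[name] = [
                (1.0 - noise_level) * v + noise_level * rng.gauss(0.0, 1.0)
                for v in values
            ]
    return model_input


# Usage: swapping the condition set changes the task without changing the code.
frames = {"rgb": [0.1, 0.2, 0.3], "depth": [0.5, 0.5, 0.5]}
rgb_to_depth = build_conditional_input(frames, {"rgb"}, noise_level=1.0)
depth_to_rgb = build_conditional_input(frames, {"depth"}, noise_level=1.0)
```

Under this reading, a single model could cover several tasks because the choice of which modalities are conditions and which are targets is an input, not a separate architecture.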