Diffusion Model as a Generalist Segmentation Learner
…A parallel CLIP-aligned text pathway injects language features across multiple scales, enabling the model to align textual queries with evolving visual representations. This design transforms an off-the-shelf diffusion backbone…
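
The multi-scale text injection described above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the injection is a cross-attention step with a residual connection, that every scale shares one channel width (a real model would likely project the text features per scale), and it uses random vectors as stand-ins for the CLIP-aligned text features. The names `inject_text`, `random_tokens`, and `DIM` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def inject_text(visual_tokens, text_tokens):
    """Cross-attention from visual queries to text keys/values, plus a residual.

    visual_tokens: list of N vectors, each of length d (one per spatial position).
    text_tokens:   list of T vectors, each of length d (one per text token).
    Returns N vectors of length d.
    """
    d = len(text_tokens[0])
    fused = []
    for query in visual_tokens:
        # Scaled dot-product score between this position and every text token.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in text_tokens
        ]
        weights = softmax(scores)
        # Weighted sum of text vectors (text acts as both keys and values here).
        attended = [
            sum(w * value[j] for w, value in zip(weights, text_tokens))
            for j in range(d)
        ]
        # Residual connection keeps the original visual feature.
        fused.append([q + a for q, a in zip(query, attended)])
    return fused


def random_tokens(count, dim, rng):
    """Random stand-in features; a real pipeline would use encoder outputs."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
DIM = 8  # assumed channel width, shared across scales for simplicity

text = random_tokens(4, DIM, rng)  # stand-in for 4 text-token embeddings

# Three hypothetical scales: 16x16, 8x8, and 4x4 feature maps, flattened.
pyramid = [random_tokens(n, DIM, rng) for n in (256, 64, 16)]

# The same text features are injected independently at every scale.
fused_pyramid = [inject_text(level, text) for level in pyramid]
print([len(level) for level in fused_pyramid])  # prints [256, 64, 16]
```

Each scale keeps its own spatial resolution while attending to the same text tokens, which is one plausible reading of how a textual query could stay aligned with visual features as they change through the backbone.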