Showing top 55 results for "AI training and model updates"

Paper page - MemLens: Benchmarking Multimodal Long-Term Memory in Large Vision-Language Models

…Xiyu Ren, Yiming Du, … Abstract: A new benchmark evaluates memory capabilities in vision-language models through multi-session conversations, revealing limitations of both long-context and memory-augmented approaches. AI-generated summary…

May 15, 2026

Paper page - JLT: Clean-Latent Prediction in Latent Diffusion Transformers

…We introduce JLT, a 130M latent diffusion Transformer over frozen FLUX.2 VAE codes, and compare clean-latent prediction with a matched velocity-prediction DiT under the same representation, backbone, and training…

May 27, 2026

Paper page - Motion-Aware Caching for Efficient Autoregressive Video Generation

…Inference-Time Mixed-Precision Quantization for Video Diffusion Models (2026) Streaming Autoregressive Video Generation via Diagonal Distillation (2026) PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference (2026…

May 6, 2026

Paper page - From Context to Skills: Can Language Models Learn from Context Skillfully?

…Continual Learning from Experience and Skills in Multimodal Agents (2026) Training-Free Test-Time Contrastive Learning for Large Language Models (2026) OSExpert: Computer-Use Agents Learning Professional Skills via Exploration (2026) WebXSkill…

May 5, 2026

Paper page - Unlocking Complex Visual Generation via Closed-Loop Verified Reasoning

…AI-generated summary: Despite rapid advancements, current text-to-image (T2I) models predominantly rely on a single-step generation paradigm, which struggles with complex semantics and faces diminishing returns from parameter scaling…