Orthrus: Memory-Efficient Parallel Token Generation via Dual-View Diffusion
…AI-generated summary: We introduce Orthrus, a simple and efficient dual-architecture framework that unifies the exact generation fidelity of autoregressive Large Language Models (LLMs) with the high-speed parallel token generation…