Research & Papers · MarkTechPost · May 20, 2026

NVIDIA AI Releases Nemotron-Labs-Diffusion: A Tri-Mode Language Model with 6× Tokens Per Forward Over Qwen3-8B


NVIDIA introduces Nemotron-Labs-Diffusion, a language model with three decoding modes: autoregressive, diffusion, and self-speculation. It is reported to produce up to 6× as many tokens per forward pass as Qwen3-8B without sacrificing accuracy. Each decoding mode is tailored to a specific deployment context, from high-concurrency cloud serving to decoding multiple tokens in parallel, so the same model can be deployed flexibly across different computational environments. The model is trained with a joint AR-diffusion objective intended to keep performance robust across tasks and scales.

Author: Morein.ai Editorial

NVIDIA has introduced Nemotron-Labs-Diffusion, a language model family that integrates three decoding modes within a single architecture: autoregressive (AR) decoding, diffusion-based parallel decoding, and self-speculation decoding. The family is available in 3B, 8B, and 14B parameter sizes and includes base, instruct, and vision-language variants. The unified design is meant to address the efficiency and adaptability limits of traditional autoregressive models.
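To make the tokens-per-forward-pass figure concrete, here is a minimal arithmetic sketch. It only counts forward passes and assumes a hypothetical 512-token response. The value of 6 tokens per pass is the headline "up to" figure, not a measured average, and the function name `forward_passes` is illustrative rather than part of any NVIDIA API.

```python
import math


def forward_passes(num_tokens: int, tokens_per_forward: float) -> int:
    """Return how many forward passes are needed to emit num_tokens,
    assuming each pass commits tokens_per_forward tokens on average."""
    if num_tokens < 0 or tokens_per_forward <= 0:
        raise ValueError("num_tokens must be >= 0 and tokens_per_forward > 0")
    return math.ceil(num_tokens / tokens_per_forward)


# Hypothetical response length, chosen only for illustration.
RESPONSE_TOKENS = 512

# Plain autoregressive decoding commits one token per forward pass.
ar_passes = forward_passes(RESPONSE_TOKENS, 1)

# Assumed best case from the headline: 6 tokens per forward pass.
parallel_passes = forward_passes(RESPONSE_TOKENS, 6)

print(f"AR passes: {ar_passes}, parallel passes: {parallel_passes}")
```

Fewer forward passes do not translate one-to-one into lower latency, since a pass that handles several tokens may cost more than a single-token pass; the sketch shows only the pass count.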

Traditional autoregressive language models generate text one token at a time, creating a sequential dependency that restricts GPU parallelism and leads to low hardware utilization in typical deployment scenarios. Diffusion language models, by contrast, denoise multiple tokens in parallel per forward pass, offering higher throughput. While previous diffusion models often lagged in accuracy, Nemotron-Labs-Diffusion

