Daily News
NVIDIA AI Updates: June 5, 2026
1. NVIDIA Releases Open Nemotron 3 Ultra for Long-Running Agents
NVIDIA released Nemotron 3 Ultra, a 550-billion-parameter Mixture-of-Experts model built for long-running agentic workflows, where token costs and context drift are the main constraints. The model combines a hybrid Mamba-Transformer architecture, LatentMoE routing, and multi-token prediction with NVFP4 quantization, which NVIDIA says delivers up to 5x higher throughput. The company reports a score of 91% on agent productivity benchmarks and up to 30% lower cost on SWE-bench tasks. The weights, training data, and recipes are open and available through build.nvidia.com, Hugging Face, and more than 40 inference partners, including AWS SageMaker and Google Cloud.