NVIDIA AI Updates: June 12, 2026
1. NVIDIA Details Deploying MiniMax M3 for Long-Context Reasoning and Agentic Workflows
NVIDIA published guidance for deploying MiniMax M3, a 428B-parameter Mixture-of-Experts multimodal model, on NVIDIA Blackwell infrastructure. The model processes video, images, and text natively, supports up to 1M tokens of context, and targets agentic workflows such as long video understanding and extended coding sessions. Deployment paths include TensorRT-LLM, SGLang, vLLM, NVIDIA Dynamo for disaggregated inference, and the NeMo Framework for fine-tuning. Reported gains are 9x faster prefill and 15x faster decoding than the model's predecessor, plus a 4x interactivity improvement under Dynamo at 32k input length. Source
2. NVIDIA Adds One-Click Multi-Tenant Security to Quantum InfiniBand
NVIDIA introduced intent-based security profiles in the Unified Fabric Manager for Quantum InfiniBand, allowing multi-tenant fabric security to be applied in a single click. The profiles come in three tiers: General, Bare Metal Cloud, and Secured Bare Metal Cloud. Each one automatically configures Partition Key isolation, Management Datagram key protection, GUID-based access control, and continuous validation, cutting deployment time from hours or days to minutes. Port assignments are stored in hardware and controlled by the Subnet Manager, so tenants are isolated both cryptographically and logically. A new Continuous Security Verification tool provides automated auditing and a Security Health Score. Source
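For readers unfamiliar with Partition Key isolation, the following is a minimal sketch of the general InfiniBand P_Key matching rule, not of the Unified Fabric Manager itself. A P_Key is a 16-bit value: the top bit marks full versus limited membership and the low 15 bits identify the partition. Two ports can exchange traffic only if they share a partition and at least one of them is a full member. The function names and the tenant P_Key values below are hypothetical and chosen purely for illustration.

```python
# Sketch of InfiniBand P_Key matching (illustrative; names and values are assumptions).
FULL_MEMBER_BIT = 0x8000  # top bit: 1 = full member, 0 = limited member
KEY_BASE_MASK = 0x7FFF    # low 15 bits: partition identifier


def pkeys_match(a: int, b: int) -> bool:
    """Return True if two P_Keys allow communication."""
    base_a = a & KEY_BASE_MASK
    base_b = b & KEY_BASE_MASK
    if base_a == 0 or base_b == 0:
        # A key base of zero is treated as invalid here.
        return False
    same_partition = base_a == base_b
    at_least_one_full = bool((a | b) & FULL_MEMBER_BIT)
    return same_partition and at_least_one_full


def can_communicate(port_a_pkeys: list[int], port_b_pkeys: list[int]) -> bool:
    """Two ports can talk if any pair of entries in their P_Key tables matches."""
    return any(pkeys_match(a, b) for a in port_a_pkeys for b in port_b_pkeys)


# Hypothetical P_Key tables for two tenants and a shared storage port.
tenant_a = [0x8001]            # full member of partition 1
tenant_b = [0x8002]            # full member of partition 2
storage = [0x8001, 0x8002]     # full member of both partitions
limited_a1 = [0x0001]          # limited member of partition 1
limited_a2 = [0x0001]          # another limited member of partition 1

print(can_communicate(tenant_a, tenant_b))
print(can_communicate(tenant_a, storage))
print(can_communicate(limited_a1, limited_a2))
```

In this sketch the two tenants cannot reach each other, both can reach the shared storage port, and two limited members of the same partition cannot talk to one another, which is the property that lets a fabric expose a shared service without connecting its clients.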