NVIDIA AI Updates: May 29, 2026

1. Dynamo Snapshot Cuts Inference Cold-Start Times Up to 21x on Kubernetes

NVIDIA. NVIDIA released Dynamo Snapshot, an experimental checkpoint/restore mechanism that uses CRIU and CUDA driver checkpointing to cut the multi-minute cold starts that leave GPUs idle when inference pods scale on Kubernetes. A KV-cache unmap step shrinks the checkpoint artifact for Qwen3-0.6B from roughly 190 GiB to 6 GiB, and parallel memory restoration combined with native async I/O reduces startup time by up to 21x on large models like gpt-oss-120b, with sub-five-second restores on fast storage. The initial release supports single-GPU vLLM and SGLang workloads; multi-GPU and TensorRT-LLM support is planned. Source
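To put the artifact sizes in perspective, here is a minimal back-of-the-envelope sketch in Python. The two sizes are the approximate figures quoted above; the storage bandwidth and the helper name `read_time_seconds` are illustrative assumptions, not numbers or APIs from the release, and the result is only a lower bound on read time, not a measured restore time.

```python
# Approximate checkpoint sizes quoted in the item above (Qwen3-0.6B).
CHECKPOINT_WITH_KV_GIB = 190.0   # before the KV-cache unmap step
CHECKPOINT_UNMAPPED_GIB = 6.0    # after the KV-cache unmap step

# ASSUMPTION: illustrative sequential-read bandwidth, not a figure from the release.
ASSUMED_BANDWIDTH_GIB_PER_S = 5.0


def read_time_seconds(size_gib: float, bandwidth_gib_per_s: float) -> float:
    """Lower bound on restore time: artifact size divided by read bandwidth."""
    if bandwidth_gib_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gib / bandwidth_gib_per_s


# How many times smaller the artifact becomes once the KV cache is unmapped.
size_ratio = CHECKPOINT_WITH_KV_GIB / CHECKPOINT_UNMAPPED_GIB
print(f"artifact shrinks about {size_ratio:.1f}x")

for label, size_gib in [("with KV cache", CHECKPOINT_WITH_KV_GIB),
                        ("KV cache unmapped", CHECKPOINT_UNMAPPED_GIB)]:
    seconds = read_time_seconds(size_gib, ASSUMED_BANDWIDTH_GIB_PER_S)
    print(f"{label}: at least {seconds:.1f} s to read at "
          f"{ASSUMED_BANDWIDTH_GIB_PER_S:.0f} GiB/s")
```

Under the assumed 5 GiB/s, the smaller artifact alone moves the read-time floor from tens of seconds to about one second, which is consistent in spirit with the sub-five-second restores reported on fast storage.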

2. Blackwell Posts a STAC-AI LANG6 Record for Financial LLM Inference

NVIDIA. NVIDIA reported that its Blackwell architecture set a record on the STAC-AI LANG6 benchmark for LLM inference in financial services, claiming up to 2.8x single-GPU gains over the prior generation across Llama 3.1 analysis tasks. An eight-GPU HGX B200 system led on throughput, reaching 52,823 words per second and 311 requests per second on Llama 3.1 8B with the EDGAR4 dataset, while a two-GPU Supermicro RTX PRO 6000 Blackwell system was positioned as the space- and cost-efficient option. Because STAC results are audited, they carry more weight than NVIDIA’s usual self-reported numbers, though the comparison is still confined to NVIDIA’s own hardware tiers. Source

3. ICRA Papers Push Sim-to-Real Robotics Gains

NVIDIA. NVIDIA Research presented eight papers at ICRA centered on transferring policies trained in simulation to physical robots. The standouts include COMPASS, a cross-embodiment navigation framework that reports a 4.5x success-rate improvement over imitation-learning baselines and roughly 80% success in real-world trials, and Grasp-MPC, an adaptive grasping method that lifts real-robot success to around 75% from a 41% baseline. PEEK, a vision-language perception pipeline that filters for task-relevant objects, claims a 41x real-world accuracy improvement for simulation-trained policies. That figure is best treated as a directional signal rather than a settled number until others reproduce it. Source
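A short Python sketch shows why the multiplier-style claims above read differently from the absolute ones. It uses only the Grasp-MPC percentages and the PEEK multiplier quoted in the item; the helper name `improvement` is hypothetical, and the last calculation assumes PEEK's accuracy is a percentage capped at 100 and that "41x" is a plain ratio of new to baseline accuracy, neither of which the item confirms.

```python
def improvement(baseline_pct: float, new_pct: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative multiplier)."""
    if baseline_pct <= 0:
        raise ValueError("baseline must be positive")
    return new_pct - baseline_pct, new_pct / baseline_pct


# Grasp-MPC: approximate real-robot success rates quoted above.
points, ratio = improvement(baseline_pct=41.0, new_pct=75.0)
print(f"Grasp-MPC: +{points:.0f} points, {ratio:.2f}x relative")

# PEEK: only a multiplier is quoted.
# ASSUMPTION: accuracy is a percentage capped at 100 and 41x = new / baseline.
# Under that reading, the baseline cannot exceed 100 / 41 percent.
PEEK_MULTIPLIER = 41.0
max_baseline_pct = 100.0 / PEEK_MULTIPLIER
print(f"PEEK: baseline accuracy could be at most {max_baseline_pct:.2f}%")
```

If those assumptions hold, a 41x gain is only possible from a baseline of a few percent or less, so the multiplier says little about the absolute accuracy the system reaches.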