Daily News
Meta AI Updates: June 4, 2026
1. PyTorch Adds Muon Optimizer Support to DeepSpeed
Meta. PyTorch published a post detailing new DeepSpeed support for the Muon optimizer. Muon keeps a single momentum buffer per parameter instead of Adam’s two state buffers (first and second moments), which the post reports saves roughly 9% of GPU memory. It also outperforms AdamW on three of the four reported benchmarks, including a 1.8-point gain on MMLU. In practice Muon runs as a hybrid: Muon updates are applied to the 2D hidden weight matrices, while embeddings and layer norms fall back to AdamW. Frontier labs have adopted the optimizer for large foundation models, including Moonshot AI’s Kimi-K2-Thinking, Zhipu AI’s GLM-5 at 744B parameters, and DeepSeek-V4 at 1.6T parameters.
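The hybrid routing and the buffer accounting described above can be illustrated with a small sketch. It is not the DeepSpeed or PyTorch API: `ParamSpec`, `choose_optimizer`, `optimizer_state_elements`, the parameter names, and the rule that embeddings are recognized by the substring "embed" are all hypothetical, chosen only to show how 2D hidden matrices might be assigned to Muon while everything else stays on AdamW, and how that changes the number of optimizer-state elements.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class ParamSpec:
    """Hypothetical description of one model parameter tensor."""
    name: str
    shape: tuple[int, ...]


# Optimizer-state buffers kept per parameter element:
# Muon keeps one momentum buffer, AdamW keeps two (first and second moments).
STATE_BUFFERS = {"muon": 1, "adamw": 2}


def choose_optimizer(p: ParamSpec) -> str:
    """Route 2D hidden weight matrices to Muon, everything else to AdamW.

    Assumption: embeddings are identified by "embed" in the name. A real
    model would need its own, more careful rule.
    """
    is_matrix = len(p.shape) == 2
    is_embedding = "embed" in p.name
    return "muon" if is_matrix and not is_embedding else "adamw"


def optimizer_state_elements(params, hybrid=True) -> int:
    """Count optimizer-state elements for a hybrid or an AdamW-only setup."""
    total = 0
    for p in params:
        opt = choose_optimizer(p) if hybrid else "adamw"
        total += STATE_BUFFERS[opt] * prod(p.shape)
    return total


# Toy parameter list (shapes are illustrative, not from any real model).
params = [
    ParamSpec("embed.weight", (1000, 64)),
    ParamSpec("layer0.attn.weight", (64, 64)),
    ParamSpec("layer0.mlp.weight", (64, 256)),
    ParamSpec("layer0.norm.weight", (64,)),
]

if __name__ == "__main__":
    for p in params:
        print(f"{p.name:22s} -> {choose_optimizer(p)}")
    print("AdamW-only state elements:", optimizer_state_elements(params, hybrid=False))
    print("Hybrid state elements:    ", optimizer_state_elements(params, hybrid=True))
```

The sketch counts only optimizer state. Total GPU memory also includes weights, gradients, and activations, so the ratio in this toy example is not expected to match the roughly 9% figure reported in the post.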