Hugging Face AI Updates: April 25, 2026
1. DeepSeek-V4: A Million-Token Context That Agents Can Actually Use
Hugging Face published a technical breakdown of DeepSeek-V4, focusing on its hybrid attention design (Compressed Sparse Attention at 4x compression and Heavily Compressed Attention at 128x), which drops single-token inference to 27% of V3.2’s FLOPs and shrinks the KV cache to 10% on the Pro variant, making the 1M-token context economically viable. The post also details V4’s agent-oriented post-training: interleaved thinking preserved across tool-call rounds, an XML schema with dedicated |DSML| tokens to cut parsing errors, and the DSec Rust sandbox, which exposes function calls, containers, microVMs, and full VMs to RL training. Reported results include 80.6 on SWE Verified, 67.9 on Terminal Bench 2.0, and 0.59 accuracy on a 1M-token needle-in-a-haystack task. Source
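To make the tool-call schema concrete, here is a minimal sketch of extracting sentinel-delimited tool calls from model output. The sentinel strings, the `<invoke>`/`<param>` tag names, and the function names below are illustrative assumptions, not DeepSeek's actual DSML format; the point is only that dedicated delimiter tokens let a parser find tool-call blocks without guessing at free-form XML.

```javascript
// Hypothetical sketch of sentinel-delimited tool-call parsing.
// OPEN/CLOSE and the tag names are assumptions for illustration,
// not DeepSeek-V4's real |DSML| token strings.
const OPEN = "<|DSML|>";
const CLOSE = "</|DSML|>";

// Pull every delimited block out of a model response and parse it.
function extractToolCalls(text) {
  const calls = [];
  let from = 0;
  for (;;) {
    const start = text.indexOf(OPEN, from);
    if (start === -1) break;
    const end = text.indexOf(CLOSE, start + OPEN.length);
    if (end === -1) break; // unterminated block: stop rather than guess
    calls.push(parseCall(text.slice(start + OPEN.length, end)));
    from = end + CLOSE.length;
  }
  return calls;
}

// Parse one block of the (assumed) form:
// <invoke name="fn"><param name="k">v</param>...</invoke>
function parseCall(block) {
  const nameMatch = block.match(/<invoke name="([^"]+)">/);
  const args = {};
  for (const m of block.matchAll(/<param name="([^"]+)">([\s\S]*?)<\/param>/g)) {
    args[m[1]] = m[2];
  }
  return { name: nameMatch ? nameMatch[1] : null, args };
}
```

Because the delimiters are dedicated vocabulary tokens rather than ordinary text, the model cannot emit them accidentally mid-prose, which is the parsing-error reduction the post describes.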
2. Building a Manifest V3 Chrome Extension With Transformers.js
Hugging Face published a hands-on guide to building Manifest V3 Chrome extensions with Transformers.js, using the Gemma 4 Browser Assistant as the reference implementation. The tutorial walks through the three-tier MV3 architecture (a background service worker for model inference, a side panel for the UI, and content scripts for page access), the messaging contracts between these components, and an agent loop that keeps internal model messages separate from the user-facing chat while executing tools deterministically. It also covers practical details: caching, state persistence (memory vs. storage vs. IndexedDB), and multi-entry MV3 build configuration. Source
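The split between internal model context and user-facing chat can be sketched as a small loop. This is a minimal illustration, not the guide's actual code: `generate` stands in for a Transformers.js pipeline call, `tools` for the extension's tool registry, and all names and message shapes are assumptions.

```javascript
// Sketch: agent loop keeping the model's working context (internal)
// separate from what the side-panel UI renders (chat), with tool
// calls executed deterministically by the host, not the model.
// generate() and tools are placeholders; shapes are assumptions.
async function runAgentTurn(userText, generate, tools, maxSteps = 5) {
  const internal = [{ role: "user", content: userText }]; // full model context
  const chat = []; // only messages the UI should show
  for (let step = 0; step < maxSteps; step++) {
    const reply = await generate(internal);
    internal.push({ role: "assistant", content: reply.text });
    if (!reply.toolCall) {
      // No tool requested: surface the final answer to the user.
      chat.push({ role: "assistant", content: reply.text });
      return { chat, internal };
    }
    // Deterministic execution: the host runs the tool and feeds the
    // real result back; tool traffic never appears in the chat pane.
    const result = await tools[reply.toolCall.name](reply.toolCall.args);
    internal.push({ role: "tool", content: JSON.stringify(result) });
  }
  return { chat, internal }; // step budget exhausted
}
```

Keeping tool traffic out of `chat` is what lets the side panel show a clean conversation while the service worker still carries the full tool-call history in `internal` for subsequent generations.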