Hugging Face AI Updates: July 1, 2026
1. Every Eval Ever Results Now Surface on Hugging Face Model Pages
Hugging Face. The EvalEval Coalition and Hugging Face detailed an integration that links Every Eval Ever (EEE), a standardized JSON schema for evaluation results, with Hugging Face Community Evals so scores can be cross-posted between the two. When contributors submit results to both, the score appears directly on the relevant Hugging Face model page while keeping a traceable link back to the full EEE record and its methodology. The goal is to reduce the long-standing problem of scattered, inconsistent benchmark scores that are hard to compare across the community. Source
2. ScarfBench Benchmarks AI Agents on Enterprise Java Framework Migration
Hugging Face. On the Hugging Face blog, IBM Research published ScarfBench, an open benchmark for evaluating AI agents on enterprise Java framework migrations across Spring, Jakarta EE, and Quarkus. The authors report that even the strongest current agents achieve less than 10 percent behavioral success: agents can produce compilable code but struggle to preserve application behavior. The findings point to dependency resolution, configuration management, and environmental issues, rather than code translation itself, as the main failure points. Source