AI News: May 24, 2026
1. DeepSeek locks in a 75 percent permanent price cut, pegging V4-Pro 34x below GPT-5.5
Pricing. DeepSeek has made the temporary 75 percent discount on its flagship V4-Pro model permanent: output tokens now cost roughly $0.87 per million, down from $3.48, and uncached input tokens cost $0.435, down from $1.74. The Decoder puts the new output price at roughly 34.5 times lower than the $30 per million tokens OpenAI charges for GPT-5.5, framing the move as “a blunt price war with Western labs” aimed at token-heavy agentic workloads where teams may shift “away from the best model and toward the cheapest one that’s still good enough.” Bloomberg attributes part of the company’s confidence in sustaining the lower price to the rising availability of Huawei Ascend 950 and 950PR AI supernode systems for Chinese inference. Source
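The arithmetic behind those figures can be checked with a short Python sketch. The prices are the ones quoted above; the one-billion-token monthly workload is a hypothetical chosen only to show the scale of the gap, and the function names are illustrative.

```python
# Prices in USD per million tokens, as quoted in the item above.
OLD_OUTPUT, NEW_OUTPUT = 3.48, 0.87
OLD_INPUT, NEW_INPUT = 1.74, 0.435
GPT_OUTPUT = 30.0


def percent_cut(old_price: float, new_price: float) -> float:
    """Size of a price cut as a percentage of the old price."""
    return (old_price - new_price) / old_price * 100


def workload_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD of generating `tokens` tokens at the given rate."""
    return tokens / 1_000_000 * price_per_million


output_cut = percent_cut(OLD_OUTPUT, NEW_OUTPUT)
input_cut = percent_cut(OLD_INPUT, NEW_INPUT)
price_ratio = GPT_OUTPUT / NEW_OUTPUT

# Hypothetical agentic workload: one billion output tokens per month.
HYPOTHETICAL_TOKENS = 1_000_000_000
deepseek_bill = workload_cost(HYPOTHETICAL_TOKENS, NEW_OUTPUT)
gpt_bill = workload_cost(HYPOTHETICAL_TOKENS, GPT_OUTPUT)

print(f"output cut: {output_cut:.1f}%, input cut: {input_cut:.1f}%")
print(f"GPT-5.5 output price / V4-Pro output price: {price_ratio:.1f}x")
print(f"hypothetical monthly bill: ${deepseek_bill:,.0f} vs ${gpt_bill:,.0f}")
```

The ratio covers output tokens only; a real bill also depends on input volume and cache hit rates, which this sketch ignores.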
2. UC Berkeley Law institutes one of the strictest US generative-AI bans for graded work
Policy. UC Berkeley School of Law published a formal policy on May 22 banning students from using generative AI for “conceptualizing, outlining, drafting, revising, translating, or editing” any work submitted for credit, with the rule taking effect in Summer 2026. The policy also blocks AI use during end-of-term exams and forbids uploading any course materials (readings, slides, lecture recordings) to generative AI systems, with narrow exceptions only for courses that explicitly teach AI in legal practice. Administrators cited a documented uptick in flawed analyses and hallucinated citations and framed the rule as protecting students’ core legal-reasoning skills before they layer AI tools on top. Source
3. Alibaba’s Qwen3.7-Max runs 35 hours autonomously to write code for an undocumented chip
Research. At the May 20-21 Alibaba Cloud Summit in Hangzhou, the new Qwen3.7-Max model ran autonomously for roughly 35 hours on Alibaba’s purpose-built Zhenwu M890 accelerator, issuing about 1,158 tool calls and 432 kernel evaluations across five architectural redesigns with no access to existing chip-architecture documentation or performance data. Alibaba reports a geometric-mean 10x speedup on the resulting Extend Attention kernel versus a reference Triton implementation, and VentureBeat notes that Qwen3.7-Max also supports external agent harnesses including Anthropic’s Claude Code. The result comes from Alibaba’s own demonstration and has not yet been independently reproduced, but the duration alone pushes long-horizon agent runs into a new tier. Source
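A geometric-mean speedup averages per-benchmark ratios multiplicatively, so one outlier shape cannot dominate the headline number the way it can with an arithmetic mean. The Python sketch below shows the calculation. The timings are made up for illustration and are not Alibaba’s measurements, and the function name is hypothetical.

```python
import math


def geometric_mean_speedup(baseline_ms, optimized_ms):
    """Geometric mean of per-benchmark speedup ratios (baseline / optimized)."""
    ratios = [b / o for b, o in zip(baseline_ms, optimized_ms)]
    # exp(mean(log(r))) is the geometric mean, computed in log space.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Made-up kernel timings (milliseconds) for three input shapes.
baseline_ms = [10.0, 40.0, 100.0]   # reference implementation
optimized_ms = [2.0, 4.0, 5.0]      # tuned kernel

geo_mean = geometric_mean_speedup(baseline_ms, optimized_ms)
arith_mean = sum(b / o for b, o in zip(baseline_ms, optimized_ms)) / len(baseline_ms)

print(f"geometric mean speedup: {geo_mean:.2f}x")
print(f"arithmetic mean speedup: {arith_mean:.2f}x")
```

With these invented numbers the per-shape speedups are 5x, 10x, and 20x. The geometric mean lands on the middle value, while the arithmetic mean is pulled upward by the largest ratio.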
4. Ferrari reports 35 percent download lift after rebuilding its F1 app on IBM watsonx
Enterprise. TechCrunch’s May 23 coverage details how Scuderia Ferrari HP rebuilt its fan app on IBM watsonx, embedding an AI Companion that answers race-week questions and surfaces team history alongside AI-written race summaries, fan prediction games, and behind-the-scenes content. Ferrari reports downloads up 35 percent and race-active users up 56 percent since the new features rolled out, with IBM analyzing engagement signals and message sentiment to flag content that moves casual viewers toward sustained fandom. The case study offers some of the cleanest published numbers to date on AI-driven fan engagement in pro sports. Source
5. TechCrunch argues Spotify’s everything-audio AI push is filling the app with features users did not ask for
Industry analysis. Ivan Mehta’s May 22 column reads Spotify’s recent wave of AI launches (AI music covers via the Universal Music deal, ElevenLabs-powered audiobook narration, NotebookLM-style personalized podcasts) as an attempt to become an “everything-audio app” rather than a sharper music discovery tool. The piece argues the proliferation of user-generated AI content risks cluttering the UI and pushing listeners toward competing services while making emerging human artists harder to surface. It frames Spotify’s AI bet as a strategic gamble that the platform can productize content creation without diluting what made it indispensable. Source
6. Ahead of AI maps how open-weight models are slashing KV-cache and attention costs
Research. Sebastian Raschka’s mid-May survey catalogs four architectural shifts driving open-weight efficiency at long context. Gemma 4 ships cross-layer KV sharing that cuts the cache by roughly 50 percent (about 2.7 GB saved at 128K context on the smaller variants) and Per-Layer Embeddings that grow representational capacity without bloating transformer blocks. Laguna XS.2 introduces per-layer query-head budgeting (more capacity on sliding-window layers, less on expensive global attention), and ZAYA1-8B’s Compressed Convolutional Attention computes attention directly in compressed latent space with convolutional mixing, reportedly outperforming comparable Multi-head Latent Attention designs. Source
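To see why cross-layer KV sharing roughly halves the cache, it helps to write out the standard KV-cache size formula: two tensors (keys and values) per layer that stores its own cache, each of size KV heads x head dimension x sequence length. The Python sketch below uses a hypothetical model configuration. It is not Gemma 4’s actual configuration and will not reproduce the 2.7 GB figure, and it assumes every layer caches the full context, which ignores sliding-window layers.

```python
def kv_cache_bytes(layers_with_own_kv, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence.

    The leading 2 accounts for storing both a key and a value tensor per layer.
    bytes_per_value=2 assumes 16-bit storage (fp16 or bf16).
    """
    return 2 * layers_with_own_kv * kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical configuration, chosen only to make the arithmetic easy to follow.
LAYERS = 32
KV_HEADS = 4
HEAD_DIM = 128
SEQ_LEN = 128 * 1024  # 128K tokens

# Baseline: every layer keeps its own keys and values.
baseline = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN)

# Cross-layer sharing, assumed here to pair layers so that only half of them
# store their own KV tensors and the rest reuse a neighbor's.
shared = kv_cache_bytes(LAYERS // 2, KV_HEADS, HEAD_DIM, SEQ_LEN)

GIB = 2 ** 30
print(f"baseline cache: {baseline / GIB:.1f} GiB")
print(f"with sharing:   {shared / GIB:.1f} GiB")
print(f"saved:          {(baseline - shared) / GIB:.1f} GiB")
```

The cache grows linearly with the number of layers that store KV tensors, so the saving is proportional to the fraction of layers that share, whatever the other dimensions are.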
7. Chatbot Arena holds steady with Claude Opus reasoning variants on top
Benchmarks. The May 23 refresh of the LMSYS Chatbot Arena text leaderboard still has claude-opus-4-6-thinking at 1502 and claude-opus-4-7-thinking at 1500 in the top two slots, with the non-thinking Opus variants at 1498 and 1492 rounding out the top four. Meta’s muse-spark (1489) and Google’s gemini-3.1-pro-preview (1488) anchor positions five and six, followed by gemini-3-pro (1486), gpt-5.5-high (1481), gemini-3.5-flash (1480), and gpt-5.4-high (1480). The thinking variants continue to outscore their non-thinking siblings, keeping reasoning-tuned inference at the arena ceiling for a third straight week. Source
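For a sense of what gaps this small mean in practice, the Python sketch below converts a score difference into an expected head-to-head win rate. It assumes the scores sit on the conventional Elo scale (logistic curve, base 10, scale 400). That scale is an assumption for illustration, not a description of how the leaderboard computes its ratings.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected win probability of A over B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Scores quoted in the item above.
top = 1502      # claude-opus-4-6-thinking
second = 1500   # claude-opus-4-7-thinking
tenth = 1480    # gpt-5.4-high

p_top_vs_second = expected_score(top, second)
p_top_vs_tenth = expected_score(top, tenth)

print(f"1502 vs 1500: {p_top_vs_second:.3f}")
print(f"1502 vs 1480: {p_top_vs_tenth:.3f}")
```

Under that assumption, even the 22-point spread between first and tenth place translates to an expected win rate only slightly above a coin flip, so the ordering among the leaders is best read as a near tie.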
8. Aidoc remains the lone foundation-model FDA clearance as the agency leans on PCCPs for AI device updates
Healthcare. A recap of the FDA’s AI/ML device landscape this week notes the agency has now authorized more than 1,400 AI/ML-enabled medical devices, with Aidoc’s CARE1 foundation model (cleared in February 2025) still the only foundation-model-powered clinical AI to receive clearance. Roughly 10 percent of 2025 AI/ML device clearances incorporated Predetermined Change Control Plans (PCCPs), the mechanism that lets vendors update learning algorithms post-clearance without a fresh 510(k), and the 2026 FDA guidance still does not address AI systems that learn and adapt in real-world deployment. The data underscores how slowly the foundation-model pathway is opening even as the device list keeps growing. Source
9. EU moves to finalize AI Act Omnibus, pushing high-risk obligations to December 2027
Policy. Following the May 7 political agreement, the EU’s AI Act Omnibus is on track for formal adoption by Parliament and Council by July 2026, codifying a delay of Annex III high-risk obligations (employment, education, health insurance) to December 2, 2027, and pushing obligations for AI embedded in regulated physical products like medical devices and industrial machinery to August 2028. The package also narrows what counts as “high-risk” to systems whose failure poses genuine health or safety risk and shortens the transparency grace period for AI-generated content to three months. A new prohibition on “nudifier” applications producing non-consensual intimate imagery takes effect December 2, 2026. Source
10. Berkeley Law publishes the full text of its AI policy and exception process
Policy. UC Berkeley School of Law posted the full PDF of its Summer 2026 AI policy this week, spelling out that violations are treated under the existing academic-integrity framework and giving instructors discretion to permit narrow AI use in specific assignments if disclosed in the syllabus. The policy bars uploading course materials to generative AI systems (including readings, slides, and recorded lectures) to keep training data and prompt logs out of third-party hands, and explicitly prohibits AI assistance on take-home and end-of-term exams. The document is already circulating as a template among law-school administrators weighing similar restrictions for the 2026-2027 academic year. Source