Anthropic AI Updates: July 3, 2026

1. Anthropic Details Fable 5 Cyber Safeguards and a Jailbreak Severity Framework

Anthropic. Anthropic published more detail on the safety classifiers deployed with Claude Fable 5, which sort cybersecurity requests into four tiers ranging from prohibited uses such as ransomware and malware development to benign defensive work like secure coding and incident response. The classifiers are tuned with an intentionally wide safety margin: they block some legitimate requests in exchange for higher confidence that harmful outputs are prevented. Anthropic also proposed an industry Cyber Jailbreak Severity scale, CJS-0 through CJS-4, that scores jailbreaks on four dimensions: capability gain, breadth of capability, ease of weaponization, and discoverability. It also opened a HackerOne program for researcher submissions. Source

2. Anthropic Expands Admin Visibility and Spend Controls for Claude Enterprise

Anthropic. Anthropic introduced administrative analytics and cost controls for Claude Enterprise aimed at managing the spend patterns of agentic work across organizations. New analytics surface usage and cost breakdowns by group and user, Claude Code insights covering active developers and session counts, a natural-language Analytics Chat, and an Analytics API that provides programmatic access for tools like Datadog and CloudZero. Cost controls include model defaults and entitlements that keep expensive models out of auto-selection, spend-threshold alerts at 75 and 90 percent of limits, individual user spend visibility, and an Admin API for automating cost workflows at scale. Source
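
For readers who want to picture how spend-threshold alerts at 75 and 90 percent of a limit might behave, here is a minimal Python sketch. It is an illustration only and does not use or represent Anthropic's Admin API: the function name crossed_thresholds, its parameters, and the rule that an alert fires once when spend first reaches a threshold are all assumptions made for this example.

```python
# Hypothetical illustration; not Anthropic's implementation or API.
# Alert thresholds from the item above, as fractions of the spend limit.
ALERT_THRESHOLDS = (0.75, 0.90)


def crossed_thresholds(previous_spend: float, current_spend: float, limit: float) -> list[float]:
    """Return the thresholds newly reached as spend moves from previous to current.

    Assumption: an alert fires once, at the moment spend first reaches
    a threshold, rather than on every later update above it.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    before = previous_spend / limit
    after = current_spend / limit
    # A threshold is newly reached if spend was below it and is now at or above it.
    return [t for t in ALERT_THRESHOLDS if before < t <= after]


if __name__ == "__main__":
    # Spend rises from 700 to 800 against a limit of 1000: only 75% is reached.
    print(crossed_thresholds(700, 800, 1000))
    # A single large jump from 700 to 950 reaches both thresholds at once.
    print(crossed_thresholds(700, 950, 1000))
```

Comparing the previous and current spend, rather than only the current value, is one way to avoid repeating the same alert on every update once a threshold has been passed.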