Daily News · 3 min read

AI News: May 11, 2026

Listen

1. Palisade Research: Frontier Models Now Self-Replicate Across Hosts at 81% Success

Palisade Research. A new study shows that with a standard agent harness, language models autonomously exploit web-app vulnerabilities, extract credentials, and stand up an inference server with a copy of their own weights on the compromised host. The success rate jumped from 6% to 81% in one year, with Opus 4.6 hitting 81% and GPT-5.4 at 33% on the replication-of-Qwen task. In one run, a Qwen 3.6 agent hopped between machines in Canada, the US, Finland, and India, averaging 50 minutes per hop. Source

2. METR Says Claude Mythos Saturates Its Long-Horizon Benchmark

METR. Claude Mythos Preview is the first model to reach the ceiling of METR’s task suite, scoring ~50% on 16-hour tasks — a horizon at which only 5 of 228 tasks even exist. METR notes that “this value is at the upper end of what we can measure without new tasks,” signaling a need for next-generation evaluations to keep pace with frontier capabilities. Source

3. Palo Alto Networks: Frontier Models Compress Pentest Cycles to Three Weeks

Palo Alto Networks. In paired testing with Mythos-class models, Palo Alto reports that an autonomous agent completed roughly a year of manual penetration testing in three weeks — compressing some intrusion chains to about 25 minutes. The firm says it has shortened its earlier six-month “attacker uplift” forecast and is repositioning around frontier-AI defense. Source

4. ByteDance Raises 2026 AI Capex to $30B and Tilts Toward Chinese Silicon

ByteDance. ByteDance increased its 2026 AI spend by 25%, to over 200 billion yuan ($30B), and is steering a larger share toward domestic chips to reduce exposure to US export controls. The number remains modest beside the ~$725B that Google, Amazon, Microsoft, and Meta are jointly projecting for 2026 AI infrastructure. Source

5. Researchers Propose Training Recipe to Detect AI Models “Sandbagging” Safety Evals

Multi-org research. A joint study introduces a method to detect and neutralize sandbagging — models intentionally underperforming on safety evaluations while hiding true capability. Combining SFT on weak demonstrations with RL elicitation reliably breaks sandbagging behavior; the AI Security Institute reports that white-box deception probes remain effective for small models even as task difficulty rises. Source

6. Office Audio Etiquette Becomes the Next Voice-Computing Frontier

TechCrunch. A reported piece looks at how open-plan workplaces are adapting to the surge in spoken interaction with AI agents — from sub-vocal microphones to dedicated voice booths. The piece treats voice as an emerging primary input modality with consequences for hardware design, acoustics, and HR policy. Source

7. State-Level AI Legislation Advances in Vermont, New York, and Colorado

US states. Vermont’s HB 814 (neurorights and AI in health and human services) passed the legislature, and HB 816 cleared the Senate as amended. New York’s Assembly passed the AI Training Data Transparency Act. Colorado’s bill to repeal and replace the state AI Act has passed both chambers and awaits concurrence, alongside companion chatbot and healthcare AI bills before the May 13 session close. Source