AI Architecture Updates: May 17, 2026

1. InfoQ details deterministic replay as the foundation for agent workflows

Leela Kumili on InfoQ. Her coverage of Cloudflare Workflows V2 frames deterministic, replay-safe step execution as the architectural primitive that lets long-running AI agents survive retries, partial failures, and operator restarts without duplicating side effects. The piece argues that idempotent step boundaries and built-in execution histories reduce the custom orchestration code architects have been writing around fan-out, fan-in, and multi-step reasoning. It also reads the jump to 50,000 concurrent instances as a sign that workflow engines are now table stakes for production agent platforms rather than optional infrastructure. For practitioners, the takeaway is that the locus of reliability is shifting from prompt engineering to the workflow substrate underneath the model. Source
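To make the replay idea concrete, here is a minimal Python sketch of step-level memoization. It is not the Cloudflare Workflows API: `StepHistory`, `run_step`, and `agent_workflow` are hypothetical names, and the history lives in memory, whereas a real engine would persist it durably. The point is only that a recorded step returns its stored result on replay, so its side effect runs once.

```python
from typing import Any, Callable, Dict, List


class StepHistory:
    """Hypothetical execution history: maps a step name to its recorded result."""

    def __init__(self) -> None:
        self.results: Dict[str, Any] = {}

    def run_step(self, name: str, fn: Callable[[], Any]) -> Any:
        # On replay, a completed step returns its recorded result
        # instead of executing (and repeating its side effect) again.
        if name in self.results:
            return self.results[name]
        result = fn()
        self.results[name] = result  # record only after the step succeeds
        return result


def agent_workflow(
    history: StepHistory,
    effects: List[str],
    crash_after_fetch: bool = False,
) -> str:
    """Two-step workflow; `effects` stands in for external side effects."""

    def fetch() -> str:
        effects.append("fetch")
        return "doc"

    doc = history.run_step("fetch", fetch)

    if crash_after_fetch:
        # Assumption for illustration: the worker dies between steps.
        raise RuntimeError("simulated worker crash")

    def summarize() -> str:
        effects.append("summarize")
        return f"summary of {doc}"

    return history.run_step("summarize", summarize)


if __name__ == "__main__":
    history, effects = StepHistory(), []
    try:
        agent_workflow(history, effects, crash_after_fetch=True)
    except RuntimeError:
        pass
    # Retry with the same history: "fetch" is replayed, not re-executed.
    print(agent_workflow(history, effects), effects)
```

The step name acts as the idempotency key here, which is why step boundaries have to be stable and deterministic across retries for this to work.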

2. Simon Willison sketches a tiered governance pattern for multi-tenant LLM cost control

Simon Willison. The release notes for datasette-llm-limits describe a three-plugin chain that separates LLM invocation, accounting, and enforcement into distinct layers, with per-user and global scopes evaluated against rolling time windows. The pattern matters because it generalizes beyond Datasette: it shows how to attach swappable spending policies to any LLM gateway without coupling cost control to the calling code. That is increasingly relevant as teams discover that runaway agent loops are an architectural failure mode rather than a billing problem. Willison’s framing treats budget caps as a first-class architectural concern alongside auth and rate limiting. Source
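The layering can be sketched in a few lines of Python. This is an illustration of the general pattern, not the datasette-llm-limits plugin API: `Ledger`, `Policy`, `guarded_call`, and the cost figures are all hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


class BudgetExceeded(Exception):
    """Raised by the enforcement layer when a call would exceed a cap."""


@dataclass
class Ledger:
    """Accounting layer: an append-only list of (timestamp, user, cost)."""

    entries: List[Tuple[float, str, float]] = field(default_factory=list)

    def record(self, now: float, user: str, cost: float) -> None:
        self.entries.append((now, user, cost))

    def spent(self, now: float, window: float, user: Optional[str] = None) -> float:
        # Rolling window: only entries newer than (now - window) count.
        cutoff = now - window
        return sum(
            cost
            for ts, who, cost in self.entries
            if ts > cutoff and (user is None or who == user)
        )


@dataclass
class Policy:
    """Enforcement layer: swappable caps, independent of the caller."""

    window_seconds: float
    per_user_limit: float
    global_limit: float

    def check(self, ledger: Ledger, now: float, user: str, estimate: float) -> None:
        if ledger.spent(now, self.window_seconds, user) + estimate > self.per_user_limit:
            raise BudgetExceeded(f"per-user cap reached for {user}")
        if ledger.spent(now, self.window_seconds) + estimate > self.global_limit:
            raise BudgetExceeded("global cap reached")


def guarded_call(
    invoke: Callable[[str], Tuple[str, float]],  # invocation layer: returns (text, cost)
    policy: Policy,
    ledger: Ledger,
    user: str,
    prompt: str,
    estimate: float,
    now: float,
) -> str:
    policy.check(ledger, now, user, estimate)  # enforce before spending
    text, cost = invoke(prompt)                # invoke the model
    ledger.record(now, user, cost)             # account for actual cost
    return text
```

Because the calling code only sees `guarded_call`, a different `Policy` (for example, one with per-team scopes) could be swapped in without touching the invocation or accounting layers.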