AI Architecture Updates: May 20, 2026
1. Zen van Riel codifies five production patterns that put architecture ahead of model choice
Zen van Riel. A new field guide argues that AI system reliability now depends on five repeatable patterns rather than on which foundation model a team picks: a request orchestration layer that routes, transforms, and rate-limits calls; a tiered model strategy that sends easy traffic to cheap models and reserves frontier capacity for hard reasoning; a streaming-first transport that uses SSE or WebSockets to cut perceived latency and enable early stopping; explicit context budgeting with compression, retrieval, and caching; and graceful degradation through timeout cascades, fallbacks, feature flags, and circuit breakers. The post estimates combined cost savings of 60-70% from orchestration plus tiering. It also warns against the common instinct to over-engineer, recommending that teams start simple and add layers only when evidence demands them. In this framing the architecture itself, not the model, is the load-bearing decision. Source
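To make two of these patterns concrete, here is a minimal Python sketch of tiered routing combined with graceful degradation. It is an illustration under stated assumptions, not code from the guide: the names `Tier`, `CircuitBreaker`, `pick_tier_order`, and `route` are hypothetical, the word-count difficulty heuristic and the failure threshold are placeholders, and the two stub functions stand in for real model clients.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

ModelFn = Callable[[str], str]


@dataclass
class CircuitBreaker:
    """Simplified breaker: opens after `threshold` consecutive failures.

    Assumption: no cooldown or half-open state, which a production
    breaker would normally add so a tier can recover.
    """
    threshold: int = 2
    failures: int = 0

    @property
    def open(self) -> bool:
        return self.failures >= self.threshold

    def record(self, ok: bool) -> None:
        # A success resets the count; a failure increments it.
        self.failures = 0 if ok else self.failures + 1


@dataclass
class Tier:
    name: str
    call: ModelFn
    breaker: CircuitBreaker = field(default_factory=CircuitBreaker)


def pick_tier_order(prompt: str, hard_word_count: int = 50) -> List[str]:
    """Hypothetical difficulty heuristic: long prompts try the frontier tier first."""
    if len(prompt.split()) >= hard_word_count:
        return ["frontier", "cheap"]
    return ["cheap", "frontier"]


def route(prompt: str, tiers: Dict[str, Tier]) -> str:
    """Try tiers in preference order, skipping any whose breaker is open."""
    for name in pick_tier_order(prompt):
        tier = tiers[name]
        if tier.breaker.open:
            continue
        try:
            result = tier.call(prompt)
        except Exception:
            tier.breaker.record(False)
            continue
        tier.breaker.record(True)
        return f"{name}: {result}"
    # Every tier failed or was skipped: return a degraded answer instead of raising.
    return "degraded: service temporarily unavailable"


def cheap_stub(prompt: str) -> str:
    """Stand-in for a low-cost model client (assumption)."""
    return f"ok ({len(prompt.split())} words)"


def failing_stub(prompt: str) -> str:
    """Stand-in for a frontier model that is currently timing out (assumption)."""
    raise TimeoutError("simulated timeout")


if __name__ == "__main__":
    tiers = {
        "cheap": Tier("cheap", cheap_stub),
        "frontier": Tier("frontier", failing_stub),
    }
    print(route("What is 2 + 2?", tiers))
    print(route(" ".join(["word"] * 60), tiers))
```

In this sketch a short prompt is answered by the cheap tier directly, while a long prompt tries the frontier tier, records the failure, and falls back to the cheap tier. After two consecutive failures the frontier breaker opens and that tier is skipped without being called. In keeping with the guide's advice to start simple, rate limiting, streaming, and context budgeting are left out.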