AWS AI Updates: May 22, 2026

1. SageMaker AI inference endpoints now speak the OpenAI API protocol

AWS. SageMaker Inference endpoints now accept OpenAI-compatible requests, so code written against the OpenAI SDK, LangChain, or Strands Agents can target a SageMaker endpoint by swapping the base URL and using AWS credentials with automatic token refresh. The change lets teams keep their existing client and streaming logic while running open source or fine-tuned models on their own GPU instances inside their VPC, with custom auto-scaling policies attached. It is live in 14 regions including us-east-1, us-east-2, us-west-2, eu-west-1, eu-central-1, eu-west-2, ap-northeast-1, ap-northeast-2, ap-southeast-1, ap-southeast-2, ap-southeast-3, ap-south-1, sa-east-1, and ca-central-1. Source

2. Bedrock InvokeModel APIs gain request-level usage attribution

AWS. Bedrock now lets callers attach metadata tags (team, application, environment, experiment) to individual InvokeModel and InvokeModelWithResponseStream requests, bringing those APIs to parity with Converse and ConverseStream which have supported request metadata since launch. To use it, customers turn on model invocation logging in the region, add the metadata fields on each request, then query the resulting logs to break down spend and consumption by tag without standing up extra infrastructure for chargeback or cost allocation. The feature is available in all AWS commercial regions where Bedrock runs and is exposed through Bedrock model invocation logs rather than as a separate billing surface. Source