AWS AI Updates: May 20, 2026

1. SageMaker HyperPod adds native inference data capture for EKS-based clusters

Amazon SageMaker HyperPod. HyperPod now records inference request and response payloads to S3 asynchronously, without blocking production traffic, which removes the need to build custom logging pipelines for model monitoring, audit, and drift detection. Capture can be configured at the endpoint, load balancer, or model pod level (or layered across all three), with configurable sampling rates and customer-managed KMS keys for encryption. The feature ships with both the HyperPod Inference Operator and SageMaker JumpStart, but only on clusters that use the EKS orchestrator, so Slurm-based HyperPod deployments still have to roll their own capture. Source
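
To make "sampled, asynchronous capture" concrete, here is a minimal Python sketch of the general pattern, not HyperPod's actual implementation or API. The class name `SampledCapture`, its parameters, and the `sink` callable (a stand-in for an upload to S3) are all hypothetical and chosen only for illustration.

```python
import json
import queue
import random
import threading


class SampledCapture:
    """Illustrative only: sample request/response pairs and hand them to a
    background writer so the serving path never waits on storage."""

    def __init__(self, sampling_rate, sink, seed=None, max_pending=1000):
        if not 0.0 <= sampling_rate <= 1.0:
            raise ValueError("sampling_rate must be between 0 and 1")
        self.sampling_rate = sampling_rate
        self.sink = sink  # hypothetical callable; stands in for an S3 upload
        self.dropped = 0  # records shed because the buffer was full
        self.failed = 0   # records the sink could not store
        self._rng = random.Random(seed)
        self._pending = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def record(self, request, response):
        """Called on the request path; returns without waiting on the sink."""
        if self._rng.random() >= self.sampling_rate:
            return False  # this payload was not sampled
        payload = json.dumps({"request": request, "response": response})
        try:
            self._pending.put_nowait(payload)
        except queue.Full:
            self.dropped += 1  # shed capture load rather than slow inference
            return False
        return True

    def _drain(self):
        """Background loop: move buffered payloads to the sink one by one."""
        while True:
            item = self._pending.get()
            try:
                self.sink(item)
            except Exception:
                self.failed += 1  # a capture failure must not affect serving
            finally:
                self._pending.task_done()

    def flush(self):
        """Block until every buffered payload has been handed to the sink."""
        self._pending.join()
```

In this sketch, `SampledCapture(0.1, upload)` would keep roughly one payload in ten, and `record()` returns immediately either way. Dropping records when the buffer fills is one possible design choice: it trades completeness of the captured data for predictable inference latency.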

2. SageMaker Studio lets JupyterLab and Code Editor reserve GPU capacity through Flexible Training Plans

Amazon SageMaker Studio. Studio’s JupyterLab and Code Editor IDEs can now consume GPU capacity reserved through SageMaker Flexible Training Plans (FTP), giving interactive notebook and editor sessions the same predictable access to scarce accelerator types that training jobs already had, at up to 65% off On-Demand pricing. Users pick instance type, reservation length, and start date in the FTP console, then select the purchased plan from the instance dropdown when launching a Studio app; SageMaker handles provisioning and sends proactive expiration notifications. The change closes a long-standing gap where teams could reserve GPUs for batch training but had to fight for On-Demand capacity every time a developer opened a notebook on the same hardware. Source
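
Because the discount is stated as "up to 65%", it is a ceiling on the saving, so the discounted figure is a lower bound on what a reservation costs. The Python sketch below works through that arithmetic. The function name `reservation_cost_floor` and the hourly rate in the usage note are hypothetical placeholders, not AWS prices.

```python
from decimal import Decimal


def reservation_cost_floor(on_demand_hourly, hours, max_discount=Decimal("0.65")):
    """Return (on_demand_total, lowest_possible_reserved_total).

    max_discount is the advertised ceiling ("up to 65% off"), so the second
    value is a lower bound on the reserved cost, not a quoted price.
    """
    on_demand_hourly = Decimal(str(on_demand_hourly))
    if not Decimal("0") <= max_discount <= Decimal("1"):
        raise ValueError("max_discount must be between 0 and 1")
    on_demand_total = on_demand_hourly * hours
    reserved_floor = on_demand_total * (Decimal("1") - max_discount)
    return on_demand_total, reserved_floor
```

For example, with a made-up On-Demand rate of 10.00 per hour, `reservation_cost_floor("10.00", 100)` compares 100 hours at the full rate against the same hours at the maximum discount. The actual plan price depends on instance type, reservation length, and start date.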