
Containers

AWS container services offer the broadest choice of services to run your containers and run on the best global infrastructure, with 77 Availability Zones across 24 regions. AWS also provides strong security isolation between your containers, ensures you are running the latest security updates, and gives you the ability to set granular access permissions for every container.

Recent questions


ECS agent sporadically times out while fetching secrets from SSM Parameter Store

We have an ECS cluster in us-west-2 that runs a few ECS services. We run some ECS tasks that are invoked periodically via EventBridge. All tasks use the EC2 launch type and run on container instances that we manage with an Auto Scaling Group. The AMI currently in use is amzn2-ami-ecs-hvm-2.0.20220630-x86_64-ebs. Container instances are launched in private subnets, and VPC endpoints are set up for a few AWS services, including SSM.

A few months ago we started seeing missed check-ins from the periodically launched tasks, and found that at least some of them failed to launch because of a timeout from the SSM API endpoint. In the ecs-agent log, it shows up like this:

> level=error time=2022-09-19T22:30:56Z msg="Failed to create task resource" error="fetching secret data from SSM Parameter Store in us-west-2: RequestError: send request failed\ncaused by: Post \"https://ssm.us-west-2.amazonaws.com/\": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" task="..." resource="ssmsecret"

> level=info time=2022-09-19T22:30:56Z msg="Setting terminal reason for task" reason="fetching secret data from SSM Parameter Store in us-west-2: Request Error: send request failed\ncaused by: Post \"https://ssm.us-west-2.amazonaws.com/\": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" task="..."

We tried increasing the throughput of SSM Parameter Store through its settings, but it didn't seem to have any effect.
https://docs.aws.amazon.com/systems-manager/latest/userguide/parameter-store-throughput.html

The other guides and Q&As I could find were about network misconfigurations that lead to a complete inability to talk to SSM, whereas the symptom I'm seeing is only intermittent; the ECS tasks launch without an issue most of the time.
https://aws.amazon.com/premiumsupport/knowledge-center/ssm-tcp-timeout-error/

What could be the cause? What else can I look into?
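The error text ("request canceled while waiting for connection") suggests the timeout may occur while the connection is being established, so one thing that could help narrow this down is repeatedly timing plain TCP connections from a container instance to the SSM endpoint. The following is only a minimal sketch, assuming Python 3 is available on the instance; the `probe` function and its parameters are hypothetical names chosen for illustration and are not part of the ECS agent or any AWS tool. It does not reproduce the agent's own HTTP client, so it can only approximate what the agent experiences.

```python
import socket
import time


def probe(host, port=443, timeout=3.0, attempts=5, interval=1.0):
    """Attempt a TCP connection to host:port several times.

    Returns a list of (outcome, elapsed_seconds) tuples, where outcome is
    "ok" or a string starting with "error:". This is a hypothetical helper
    for illustration, not an AWS-provided function.
    """
    results = []
    for i in range(attempts):
        start = time.monotonic()
        try:
            # create_connection performs name resolution and the TCP
            # handshake; both count toward the measured time.
            with socket.create_connection((host, port), timeout=timeout):
                outcome = "ok"
        except OSError as exc:
            # Covers DNS failures, refused connections, and timeouts.
            outcome = f"error: {exc}"
        elapsed = time.monotonic() - start
        results.append((outcome, elapsed))
        if i < attempts - 1:
            time.sleep(interval)
    return results


# Example usage on a container instance (hostname taken from the log above):
#   for outcome, elapsed in probe("ssm.us-west-2.amazonaws.com"):
#       print(f"{elapsed:.3f}s {outcome}")
```

Running something like this on a schedule and comparing the timestamps of slow or failed attempts with the task launch failures might indicate whether the problem lies in name resolution or connection setup to the VPC endpoint rather than in Parameter Store itself.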
0 answers, 0 votes, 13 views, asked 6 days ago
