ECS Fargate task STOPPED with reason "Timeout waiting for EphemeralStorage provisioning to complete." and code TaskFailedToStart


Hello Community,

As the subject says, I get this error from time to time (it is not reproducible) on an ECS Fargate task, and the container does not start.

These containers are started programmatically with the following attribute:

... ecsTaskConfig.overrides.ephemeralStorage = { sizeInGiB: 21 }; ...

I have tried to find a solution, but no luck so far. I thought about implementing a background job to check whether each initiated task actually started, but I am looking for a better solution.

Any tips or guidance would be helpful.

Thanks, Faiz

1 Answer

I haven't found a recommended approach yet, so I resolved the issue as follows:

  1. When a container is started from the code, its taskId is stored in the database.
  2. An EventBridge rule captures all container events from the relevant ECS cluster and sends them to a Lambda function.
  3. The Lambda function checks for the relevant stop reason and calls the API to reprocess.
  4. Because the Lambda function fires multiple times for the same taskId (actually the taskArn), the API maintains reprocess counters.
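
For anyone who wants to try the same approach, here is a rough sketch of the decision logic from steps 3 and 4. It is not my production code. It assumes the EventBridge "ECS Task State Change" event exposes taskArn, lastStatus, stopCode and stoppedReason under detail. The names ReprocessTracker, handleTaskEvent and the taskArn-to-job map are hypothetical, and the in-memory Map/Set only stand in for the database and the API-level counters.

```typescript
// Only the fields this sketch reads from an "ECS Task State Change" event.
interface EcsTaskStateChangeEvent {
  detail: {
    taskArn: string;
    lastStatus: string;
    stopCode?: string;
    stoppedReason?: string;
  };
}

// Text taken from the error in the question (without the trailing period).
const STORAGE_TIMEOUT_REASON =
  "Timeout waiting for EphemeralStorage provisioning to complete";

// True only for tasks that stopped because ephemeral storage timed out.
function isEphemeralStorageTimeout(event: EcsTaskStateChangeEvent): boolean {
  const d = event.detail;
  return (
    d.lastStatus === "STOPPED" &&
    d.stopCode === "TaskFailedToStart" &&
    (d.stoppedReason ?? "").includes(STORAGE_TIMEOUT_REASON)
  );
}

// Hypothetical in-memory stand-in for the database / API-level counters.
class ReprocessTracker {
  private maxAttempts: number;
  private handledTaskArns = new Set<string>();
  private attemptsByJob = new Map<string, number>();

  constructor(maxAttempts: number) {
    this.maxAttempts = maxAttempts;
  }

  // Returns true at most once per taskArn, and only while the job
  // is still under its retry limit.
  tryClaim(jobId: string, taskArn: string): boolean {
    if (this.handledTaskArns.has(taskArn)) return false; // duplicate event
    this.handledTaskArns.add(taskArn);
    const attempts = this.attemptsByJob.get(jobId) ?? 0;
    if (attempts >= this.maxAttempts) return false; // give up on this job
    this.attemptsByJob.set(jobId, attempts + 1);
    return true;
  }
}

// Returns the job id that should be reprocessed, or undefined if the
// event should be ignored. taskArnToJob plays the role of the db lookup.
function handleTaskEvent(
  event: EcsTaskStateChangeEvent,
  taskArnToJob: Map<string, string>,
  tracker: ReprocessTracker
): string | undefined {
  if (!isEphemeralStorageTimeout(event)) return undefined;
  const jobId = taskArnToJob.get(event.detail.taskArn);
  if (jobId === undefined) return undefined; // not a task we started
  return tracker.tryClaim(jobId, event.detail.taskArn) ? jobId : undefined;
}
```

In a real Lambda, the handler would call handleTaskEvent and, when it returns a job id, call the reprocess API. With a real database, the duplicate check and the counter increment would need to be atomic (for example a conditional write), since the same event can arrive concurrently.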

Do let me know if there is a better way to do it.

answered 14 days ago
