How to communicate with a single ECS container from a serverless environment

0

Dear Community,

Please imagine the following scenario:

  • I have multiple long-running computation tasks. I'm planning to package them as container images and use ECS tasks to run them.
  • I'm planning to have a serverless part for administering the tasks.

Once a computation task starts, it takes its input data from an SQS queue and begins its computation. All results also end up in an SQS queue for storage. So far, so good. Now the tricky bit: the computation task needs some human input in the middle of its computation, based on intermediate results. Simplified, the task says "I have the intermediate result of 42, should I resume with route A or route B?". Saving the state and resuming in a different container (based on A or B) is not an option; it just takes too long. Instead, I would like to have a serverless input form that sends the human input (A or B) to this specific container. What is the best way of doing it?

My idea so far: each container creates its own SQS queue and includes the URL in its intermediate result message. But this might result in many queues, and potentially in abandoned queues should a container crash. There must be a better way to communicate with a single container. I have seen ECS Exec, but it seems to be built more for debugging purposes.
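To make the idea above concrete, here is a minimal sketch of the container side, assuming boto3's SQS client. The queue name prefix `decision-`, the message fields `intermediate_result`, `reply_queue_url` and `route`, and the helper names are all hypothetical; the client is passed in so the logic can be tried without AWS access.

```python
import json
import uuid


def create_reply_queue(sqs, task_id=None):
    """Create a queue that only this container reads from; return its URL."""
    # 'decision-' is a hypothetical naming convention, not an AWS requirement.
    name = f"decision-{task_id or uuid.uuid4().hex}"
    return sqs.create_queue(QueueName=name)["QueueUrl"]


def intermediate_result_message(result, reply_queue_url):
    """Body of the intermediate result message, including where to reply."""
    return json.dumps(
        {"intermediate_result": result, "reply_queue_url": reply_queue_url}
    )


def wait_for_route(sqs, reply_queue_url, max_polls=None):
    """Long-poll the reply queue until a decision arrives.

    Returns the value of the (assumed) 'route' field, or None if
    max_polls is reached first.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        resp = sqs.receive_message(
            QueueUrl=reply_queue_url,
            MaxNumberOfMessages=1,
            WaitTimeSeconds=20,  # long polling, the SQS maximum
        )
        for msg in resp.get("Messages", []):
            sqs.delete_message(
                QueueUrl=reply_queue_url, ReceiptHandle=msg["ReceiptHandle"]
            )
            return json.loads(msg["Body"])["route"]
    return None
```

In a real task the client would be `boto3.client("sqs")`, and the serverless form would call `send_message` on the URL taken from the intermediate result.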

3 Answers
1

The SQS option is a good one. A way to work around the abandoned queue issue is to write the names of the queues into a DynamoDB table with a TTL. Every so often, the container updates the TTL. If the container crashes, the record will eventually be deleted, which will show up in a DynamoDB stream, which can be used to delete the queue.
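A rough sketch of that cleanup flow, assuming boto3-style clients. The attribute name `expires_at`, the key name `queue_url`, and the 15-minute window are assumptions for illustration: `expires_at` would have to match the TTL attribute configured on the table, and `queue_url` is assumed to be the table's partition key. The container calls `send_heartbeat` periodically; a Lambda function subscribed to the table's stream runs `handler`.

```python
import time

TTL_SECONDS = 15 * 60  # hypothetical: how long a record survives without a heartbeat


def heartbeat_item(queue_url, now=None):
    """Build the item the container writes on every heartbeat."""
    now = int(time.time()) if now is None else int(now)
    return {
        "queue_url": {"S": queue_url},  # assumed partition key
        # DynamoDB TTL expects a Number attribute holding epoch seconds.
        "expires_at": {"N": str(now + TTL_SECONDS)},
    }


def send_heartbeat(dynamodb, table_name, queue_url):
    """Overwrite the record so its expiry moves into the future."""
    dynamodb.put_item(TableName=table_name, Item=heartbeat_item(queue_url))


def expired_queue_urls(stream_event):
    """Collect queue URLs from REMOVE records in a DynamoDB stream event."""
    urls = []
    for record in stream_event.get("Records", []):
        if record.get("eventName") != "REMOVE":
            continue
        keys = record["dynamodb"]["Keys"]
        urls.append(keys["queue_url"]["S"])
    return urls


def handler(event, context, sqs=None):
    """Lambda entry point: delete the queue behind every removed record."""
    if sqs is None:
        import boto3  # imported lazily so the pure logic runs without boto3

        sqs = boto3.client("sqs")
    for url in expired_queue_urls(event):
        sqs.delete_queue(QueueUrl=url)
```

Two caveats: DynamoDB removes expired items in the background rather than at the exact expiry time, so an abandoned queue can outlive its container for a while. This sketch also reacts to every REMOVE record, including records the container deletes itself on a clean shutdown.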

An alternative might be to expose an HTTP endpoint directly on the ECS task itself and send its URL along with the request for input. When the user input is received, you make a call to that specific URL. This would make the solution simpler.
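As an illustration of the HTTP option, here is a minimal sketch using only the Python standard library. The `/decision` path, the JSON field `route`, and port 8080 are hypothetical choices. There is no authentication here, so a real deployment would need to restrict who can reach the port (for example with security groups) or add its own checks.

```python
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Decisions received over HTTP are handed to the computation via this queue.
decisions = queue.Queue()


class DecisionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            route = json.loads(self.rfile.read(length)).get("route")
        except (ValueError, AttributeError):
            route = None  # body was not a JSON object
        if self.path != "/decision" or route not in ("A", "B"):
            self.send_response(400)
            self.end_headers()
            return
        decisions.put(route)
        self.send_response(204)  # accepted, nothing to return
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the task's stdout free of access logs


def start_decision_server(port=8080):
    """Serve in a background thread so the computation keeps the main thread."""
    server = HTTPServer(("0.0.0.0", port), DecisionHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def wait_for_decision(timeout=None):
    """Block until the form posts a route; raises queue.Empty on timeout."""
    return decisions.get(timeout=timeout)
```

The task would call `start_decision_server()` at startup, publish its own address with the intermediate result, and call `wait_for_decision()` at the point where it needs the human input.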

I am not sure I understand why saving the state and resuming in a different container takes too long. Do you mean the time it takes to save and restore the state? If so, keep in mind that there is a human in the loop, who will probably take much longer. Also, while the container waits for the response, it does no work, so you are paying for idle compute. I would revisit the option of making the process event-driven and resuming in a different container.

Finally, I would recommend looking into breaking the process into smaller tasks and using Step Functions to orchestrate them, instead of a single ECS task. You can then use the Wait for Task Token pattern to get the human's response.
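With the task token pattern, the state machine pauses at a step and hands out a token; whoever holds the token resumes the execution by calling `SendTaskSuccess`. Below is a small sketch of what the form's backend might do, assuming a boto3 Step Functions client. How the token reaches the form (for example, embedded in the intermediate result message) and the output field `route` are assumptions.

```python
import json


def submit_decision(sfn, task_token, route):
    """Resume the paused execution with the human's choice.

    `sfn` is expected to behave like boto3.client("stepfunctions").
    """
    if route not in ("A", "B"):
        raise ValueError(f"unexpected route: {route!r}")
    sfn.send_task_success(
        taskToken=task_token,
        # The output must be a JSON string; it becomes the step's result.
        output=json.dumps({"route": route}),
    )
```

A Lambda function behind the input form could call `submit_decision(boto3.client("stepfunctions"), token, "A")`, and the next state would then branch on `route`.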

AWS
EXPERT
Uri
answered a year ago
  • Thanks! Is it possible to reach a single container via HTTP if multiple containers are running on a single host machine?

  • Not a container expert, but should be possible.

0

This seems like an orchestration problem. Have you considered Step Functions? You can invoke and manage ECS tasks with it.

https://docs.aws.amazon.com/step-functions/latest/dg/connect-ecs.html

AWS
Roly
answered a year ago
  • I see the similarities, but as long as the task is not splittable, I need to pass information into an already running container, not only start parameters.

0

While this isn't exactly what you were asking, as a generic way of getting interactive access to ECS tasks you could use ECS Exec. It works for both EC2 and Fargate, and using Copilot makes it easier.

https://aws.amazon.com/blogs/containers/new-using-amazon-ecs-exec-access-your-containers-fargate-ec2/

https://aws.amazon.com/blogs/containers/connecting-to-an-interactive-shell-on-your-containers-running-in-aws-fargate-using-aws-copilot/

EXPERT
Kallu
answered a year ago
