Hi,
You may want to consider AWS Batch instead of a Fargate container to execute your jobs.
See the documentation at https://aws.amazon.com/batch/
As its name implies, it was expressly built to execute one-off autonomous jobs. Fargate is better suited to long-running server tasks that answer multiple clients.
BTW, AWS Batch can run on top of Fargate but is much simpler to set up for jobs: that is the option that I use (heavily!) for my long jobs.
Some good starting points:
- https://docs.aws.amazon.com/batch/latest/userguide/get-set-up-for-aws-batch.html
- https://stackify.com/aws-batch-guide/
- https://www.youtube.com/watch?v=k7r6i3x5d7Q
Best,
Didier
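For readers who want to try the Batch route from code, here is a minimal sketch of submitting a one-off job with boto3. It assumes a job queue and a job definition already exist; the function names and every argument value are hypothetical placeholders, not taken from this thread. The request builder is kept separate from the API call so it can be checked without AWS access.

```python
def build_submit_job_request(job_name, job_queue, job_definition,
                             command=None, environment=None):
    """Assemble keyword arguments for the AWS Batch SubmitJob API."""
    request = {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
    }
    overrides = {}
    if command:
        # Replaces the command baked into the job definition.
        overrides["command"] = list(command)
    if environment:
        # Batch expects a list of {"name": ..., "value": ...} pairs.
        overrides["environment"] = [
            {"name": key, "value": value}
            for key, value in sorted(environment.items())
        ]
    if overrides:
        request["containerOverrides"] = overrides
    return request


def submit_job(request, client=None):
    """Submit the job and return its ID.

    boto3 is imported lazily; a real call needs AWS credentials and a region.
    """
    if client is None:
        import boto3  # third-party dependency
        client = boto3.client("batch")
    response = client.submit_job(**request)
    return response["jobId"]
```

A call might look like `submit_job(build_submit_job_request("nightly-update", "my-queue", "my-job-def"))`, where the queue and job definition names are placeholders for your own resources.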
The error message you encountered indicates a problem with the Elastic Network Interface (ENI) associated with your ECS task on Fargate. Here are some steps to troubleshoot:
- Task Definition: networkMode should be awsvpc.
- Subnets and Security Groups: Verify the subnet and security group configuration used for the task.
- IAM Role: Ensure the task execution role has the required permissions.
- Service Limits: Check EC2 and VPC limits.
- AWS VPC Flow Logs: Enable VPC Flow Logs to capture IP traffic information.
Example IAM Policy for Task Execution Role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:CreateLogGroup",
        "ecs:DescribeTasks",
        "ecs:DescribeTaskDefinition",
        "ec2:DescribeNetworkInterfaces",
        "ec2:CreateNetworkInterface",
        "ec2:DeleteNetworkInterface"
      ],
      "Resource": "*"
    }
  ]
}
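Some of the checks above can be automated. The sketch below inspects a task definition given as a plain dict, in the shape returned under "taskDefinition" by the ECS DescribeTaskDefinition API. The function name and the exact set of checks are assumptions made for illustration; it does not cover subnets, security groups, or service limits.

```python
def check_fargate_task_definition(task_def):
    """Return a list of human-readable problems found in a task definition dict."""
    problems = []
    if task_def.get("networkMode") != "awsvpc":
        problems.append("networkMode must be 'awsvpc' for Fargate tasks")
    if "FARGATE" not in task_def.get("requiresCompatibilities", []):
        problems.append("requiresCompatibilities should include 'FARGATE'")
    if not task_def.get("executionRoleArn"):
        # Usually needed to pull private images and to write CloudWatch logs.
        problems.append("executionRoleArn is not set")
    for container in task_def.get("containerDefinitions", []):
        if "logConfiguration" not in container:
            name = container.get("name", "<unnamed>")
            problems.append(f"container {name} has no logConfiguration")
    return problems
```

An empty list means none of these particular checks failed; it does not prove the task will start.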
Thanks for the guide. However, after following your instructions, the same error still persists.
Solved! The answer to the above issue with Fargate is that you have to set up the service role and the execution role (CloudWatch Logs) correctly, so that you can see which particular errors occurred. After that, running it via ECS Fargate or via Batch leads to the same outcome.
Thanks for your suggestion. I shifted to Batch and managed to run the job with "succeeded" status and exit code "0", but the application does not seem to produce the expected result. My script runs well locally though :/
Hi, can you elaborate (error messages, etc.) on the issue that your script faces? For such a script, my recommendation is to add lots of CloudWatch logs to understand what happens at every stage of your code. If you're using Python, packages like loguru (https://github.com/Delgan/loguru) make it easy to write such logs.
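As an illustration of stage-by-stage logging, here is a minimal sketch that uses Python's standard logging module rather than loguru, so it has no extra dependency. It assumes the container's stdout is forwarded to CloudWatch Logs (for example through the awslogs log driver); the logger name "job" and the helper names are hypothetical.

```python
import logging
import sys


def configure_logging(level=logging.INFO, stream=None):
    """Send log records to stdout (or the given stream) with timestamps."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger = logging.getLogger("job")
    logger.handlers.clear()   # avoid duplicate lines if called twice
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False
    return logger


def run_stage(logger, name, func):
    """Run one stage of the job, logging its start, end, and any exception."""
    logger.info("stage %s: start", name)
    try:
        result = func()
    except Exception:
        # logger.exception appends the full traceback to the log record.
        logger.exception("stage %s: failed", name)
        raise
    logger.info("stage %s: done", name)
    return result
```

Wrapping each step, for example `run_stage(logger, "fetch", fetch_data)` with your own `fetch_data` function, makes it visible in the logs which stage a job that exits with code 0 actually reached.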
Hi, the error turns out to be related to the credentials stored in Secrets Manager. Not sure what's wrong... I have configured the permission to retrieve credentials from Secrets Manager correctly and attached it to the execution role (Batch).
Traceback (most recent call last):
  File "/var/task/bundesliga_update_ecs.py", line 201, in <module>
    main()
  File "/var/task/bundesliga_update_ecs.py", line 156, in main
    secret = get_secret("GOOGLE_APPLICATION_CREDENTIALS")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/task/bundesliga_update_ecs.py", line 17, in get_secret
    get_secret_value_response = client.get_secret_value(SecretId=secret_name)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/botocore/client.py", line 565, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/botocore/client.py", line 999, in _make_api_call
    http, parsed_response = self._make_request(
                            ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-pa
The error was due to the wrong syntax for retrieving the secret (treating it as a string instead of JSON format)...
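To make the string-versus-JSON point concrete, here is a minimal sketch of reading a secret whose value is stored as a JSON document. It is not the poster's actual code: the function bodies are assumptions, and only `get_secret_value(SecretId=...)` and the `SecretString` response key come from the Secrets Manager API. Secrets stored as binary come back under `SecretBinary` instead and are not handled here.

```python
import json


def parse_secret(secret_string):
    """Return a dict if the secret is a JSON object, else the raw string."""
    try:
        parsed = json.loads(secret_string)
    except json.JSONDecodeError:
        return secret_string
    # Only treat JSON objects as structured secrets; keep "123" etc. as text.
    return parsed if isinstance(parsed, dict) else secret_string


def get_secret(secret_name, client=None):
    """Fetch a secret from AWS Secrets Manager and decode it.

    boto3 is imported lazily; a real call needs AWS credentials, a region,
    and secretsmanager:GetSecretValue permission on the role the job runs as.
    """
    if client is None:
        import boto3  # third-party dependency
        client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_name)
    return parse_secret(response["SecretString"])
```

With a JSON secret, individual fields are then read by key from the returned dict rather than by treating the whole value as one string.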