Logs Missing from CloudWatch when Stopping ECS Container


I have Java code (a Maven build) that creates JSON files, uploads them to S3, and writes a log message with the file name after each upload. The logs are stored in CloudWatch, using "awslogs" as the log driver. When I run the code in ECS as a Docker container, the files upload to S3 and I can see a log message for each file. However, when I stop the container by changing the desired count for the task to 0, the log message for the last file uploaded to S3 is missing, even though the file itself is there. What could be the reason for this gap between the logs and the S3 data?

1 Answer

Hello,

Thank you for providing all the details. From them, I can see this: "(...) upload the file to S3 and after each upload I record a log message with file name. (...)".

I'm assuming that your service doesn't have a load balancer in front of it. In that case, ECS sends a SIGTERM signal to your task and, 30 seconds later (by default), sends SIGKILL if the application hasn't stopped on its own. It looks like the process is being stopped before the log message is recorded.

I'd recommend checking the stopTimeout configuration. It can be set either through the task definition parameters or, for EC2 launch types only, through the ECS agent configuration using ECS_CONTAINER_STOP_TIMEOUT. Fargate has a maximum of 120 seconds; there is no limit on EC2.
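As an illustration only (the container name and image below are placeholders, not taken from your setup), a container definition with stopTimeout set could look like this:

```json
{
  "containerDefinitions": [
    {
      "name": "json-uploader",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/json-uploader:latest",
      "essential": true,
      "stopTimeout": 120
    }
  ]
}
```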

To help you further, I'm sharing this blog post, which has code snippets for SIGTERM handling in different languages. You can use something similar to make sure the S3 file is completely copied and the logs are shipped before the process stops.
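Since your application is in Java, here is a minimal sketch of what SIGTERM handling with a JVM shutdown hook could look like; the class name, the 30-second wait, and the loop body are illustrative placeholders, not your actual code:

```java
public class UploaderApp {

    // Flag the main loop checks so it stops picking up new work after SIGTERM.
    private static volatile boolean running = true;

    public static void main(String[] args) {
        Thread mainThread = Thread.currentThread();

        // ECS sends SIGTERM when the task is stopped; the JVM runs shutdown hooks for it.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            running = false;
            try {
                // Give the in-flight upload and its log message time to finish.
                // This wait must be shorter than the task's stopTimeout.
                mainThread.join(30_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));

        while (running) {
            // 1. build the JSON file
            // 2. upload it to S3
            // 3. log the file name (the message that is currently getting lost)
        }
    }
}
```

With this pattern the main loop finishes the current file and its log line before exiting, instead of dying mid-iteration when SIGTERM arrives.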

Regarding the CloudWatch logging driver: it uses blocking mode by default, which means that whenever there is something to send to CloudWatch Logs, the main process is blocked until the logs are sent. This guarantees the logs are delivered without losing any of them. You can also configure non-blocking mode, which buffers the logs in the container's memory and sends them to the endpoint whenever possible (on a best-effort basis). You can check which mode is configured in your logConfiguration. Please find further information in this documentation.
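For reference, a logConfiguration fragment with the mode set explicitly could look like the following; the log group, region, and stream prefix are placeholders:

```json
"logConfiguration": {
  "logDriver": "awslogs",
  "options": {
    "awslogs-group": "/ecs/json-uploader",
    "awslogs-region": "us-east-1",
    "awslogs-stream-prefix": "ecs",
    "mode": "blocking"
  }
}
```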

Hope this helps you to solve the issue.

AWS
answered 9 months ago
  • Thanks for your response! I would like to add that:

    1. I have already tried the stopTimeout configuration in the task definition for the ECS Fargate task, but it is not working; even with the configuration in place the task still shuts down in 1-2 seconds.
    2. I also added the sample code provided by AWS to handle the SIGTERM signal, but that code fails in CodeBuild.
    3. I am using blocking mode for the logs. PS - The way I am terminating the task is by changing the desired task count to 0 and then checking how long it takes for the task count to show 0 under the Health tab in the ECS dashboard. It generally takes 1-2 seconds for the running task count to go from 1 to 0, which is how I know the task ended in 1-2 seconds rather than waiting for the 60-second duration specified in the stopTimeout configuration.
  • It looks like the app is shutting down as soon as it receives the SIGTERM signal. You can test this locally by sending SIGTERM to the process and checking whether it terminates immediately. For stopTimeout to have any effect, the application must not exit immediately on SIGTERM; stopTimeout is only the time the ECS agent waits before sending SIGKILL. If the application terminates on SIGTERM, the stopTimeout never comes into play. I'd check the app's signal handling to fix this issue.
