We have a Java Spring Boot ECS Fargate Scheduled Task that has been running every hour for many months with no issues. On April 8, we noticed that our containers had stopped terminating themselves on failure, and we had racked up a sizable bill by the time we found hundreds of "running" containers that were doing nothing. I've looked at the logs, and while we did see a spate of failures recently, we have seen similar or identical errors before in containers that did terminate themselves. These tasks connect to Redis and get an OOM error (which still needs to be looked into as well), but I think that problem simply surfaced this other, much more costly one.
Has there been a change in behavior for scheduled tasks?
It turns out we had added a service that was holding a thread open, which prevented the rest of the container from terminating.
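For anyone who hits the same thing, here is a minimal sketch of the general failure mode and a few ways around it. It assumes the lingering thread belongs to an executor-style pool, which may not match every setup. All names here (`TaskShutdownSketch`, `daemonFactory`, `runJob`, the "background-worker" thread name) are hypothetical and not from our code. A JVM only exits on its own once every non-daemon thread has finished, so the sketch shows three common guards: daemon worker threads, stopping the pool in a `finally` block, and an explicit `System.exit`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; class and method names are illustrative only.
final class TaskShutdownSketch {

    // Guard 1: daemon worker threads do not keep the JVM alive once main returns.
    static ThreadFactory daemonFactory() {
        return runnable -> {
            Thread thread = new Thread(runnable, "background-worker");
            thread.setDaemon(true);
            return thread;
        };
    }

    // Guard 2: always stop the pool, whether the job succeeded or failed.
    // Returns the exit code the process should end with.
    static int runJob(ExecutorService pool, boolean simulateFailure) {
        try {
            pool.submit(() -> { /* background work would go here */ });
            if (simulateFailure) {
                // Stand-in for a real failure such as the Redis OOM error.
                throw new IllegalStateException("simulated failure");
            }
            return 0;
        } catch (IllegalStateException e) {
            System.err.println("Job failed: " + e.getMessage());
            return 1;
        } finally {
            pool.shutdownNow();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor(daemonFactory());
        // Passing any command-line argument simulates a failing run.
        int exitCode = runJob(pool, args.length > 0);
        // Guard 3: exit explicitly so a stray non-daemon thread cannot hold the JVM open.
        System.exit(exitCode);
    }
}
```

Any one of the three guards is usually enough on its own; the sketch combines them only to show each in context. In a Spring Boot application, the explicit exit is often written as `System.exit(SpringApplication.exit(context))`, which closes the application context first and then ends the JVM with the resulting exit code.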