Patch Group Detailed Status Terminated


I set up Patch Manager patch groups for our managed instances and EC2 instances. There are four patch groups altogether, and each group contains about 40 instances. If I run a scan or install on the hosts in a group, I get the expected results. However, when the maintenance window runs against the patch group, the majority of the hosts come back with status: failed / detailed status: terminated.

There is no output to review, and on the hosts the command ID doesn't even show up in the documents folder. The logs folder for the SSM agent on the hosts also appears clean. I checked the maintenance window and confirmed that we allow unregistered targets, so that is not the problem. I have also reviewed the commands sent to the hosts at the time of the maintenance window, and no additional AWS-RunPatchBaseline commands go out at the same time.

Since the command never reaches the host and I can send AWS-RunPatchBaseline manually without problems, I assume it's something with the maintenance window, but I can't figure out what, as there doesn't seem to be much to it. We do have another command that goes out 2 hours AFTER the patch group maintenance window ends; it hits all the instances with just a scan and always returns successful without any failures.

asked 2 years ago, 396 views
1 Answer

A detailed status of "Terminated" in SSM Run Command means that the parent command exceeded its max-errors limit and the system canceled the subsequent command invocations.

(-) https://docs.aws.amazon.com/systems-manager/latest/userguide/monitor-commands.html#monitor-about-status

Run Command (AWS-RunPatchBaseline is a Run Command document) has rate controls for commands sent to a group of managed nodes: a concurrency control that limits how many nodes run the command at the same time, and an error control that sets how many errors are allowed before the system stops sending the command to additional nodes.

(-) https://docs.aws.amazon.com/systems-manager/latest/userguide/send-commands-multiple.html#send-commands-rate

Example:

You are patching 10 managed nodes and have set the "Error threshold" field to 50%, which allows 5 errors. As soon as the 6th error is received, the system stops sending the command to additional nodes, and the detailed status for those remaining nodes is "Terminated".
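The example above can be sketched as a small simulation. This is only an illustration of the threshold logic, not how the service is implemented: the function name `simulate_dispatch` is hypothetical, nodes are processed one at a time (real Run Command dispatches concurrently, so invocations already in flight may still finish), and the threshold is given as an absolute count (5, i.e. 50% of 10).

```python
def simulate_dispatch(outcomes, max_errors):
    """Simulate sequential dispatch with an error threshold.

    outcomes   -- list of booleans, True if the command would succeed on that node
    max_errors -- number of errors tolerated; dispatch stops once this is exceeded
    Returns one status string per node.
    """
    statuses = []
    errors = 0
    stopped = False
    for ok in outcomes:
        if stopped:
            # The command is never sent to this node, so there is no output or log.
            statuses.append("Terminated")
            continue
        if ok:
            statuses.append("Success")
        else:
            errors += 1
            statuses.append("Failed")
            if errors > max_errors:
                stopped = True
    return statuses


# 10 nodes, threshold of 5 errors: the 6th failure stops the dispatch.
result = simulate_dispatch([False] * 6 + [True] * 4, max_errors=5)
print(result)
```

With a threshold of 0, the same function marks every node after the first failure as "Terminated", which is why a low threshold can leave most of a 40-node group without any output.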

In your case, it is possible that the "Error threshold" field in the maintenance window task is set to a low value. As soon as the patching operation fails on a small number of nodes, the system cancels the invocations for the rest of the managed nodes and puts them in the "Terminated" state. This is why you see no output and nothing in the logs: the command was never sent to those nodes.
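One way to check this is to read the rate controls on the maintenance window's tasks and then separate the nodes that actually failed from the ones that were only terminated. The following is a minimal sketch, assuming a boto3 SSM client is passed in as `ssm`. The helper names `window_rate_controls` and `split_invocations` are hypothetical, and treating the `Failed` and `TimedOut` statuses as the ones that count toward the threshold is an assumption to verify against the Run Command status documentation.

```python
def window_rate_controls(ssm, window_id):
    """List MaxConcurrency and MaxErrors for every task in a maintenance window."""
    rows = []
    kwargs = {"WindowId": window_id}
    while True:
        page = ssm.describe_maintenance_window_tasks(**kwargs)
        for task in page.get("Tasks", []):
            rows.append({
                "task_id": task.get("WindowTaskId"),
                "task_arn": task.get("TaskArn"),
                "max_concurrency": task.get("MaxConcurrency"),
                "max_errors": task.get("MaxErrors"),
            })
        if not page.get("NextToken"):
            return rows
        kwargs["NextToken"] = page["NextToken"]


def split_invocations(ssm, command_id):
    """Separate nodes that reported an error from nodes that were terminated."""
    errored, terminated = [], []
    kwargs = {"CommandId": command_id}
    while True:
        page = ssm.list_command_invocations(**kwargs)
        for inv in page.get("CommandInvocations", []):
            detail = inv.get("StatusDetails")
            if detail == "Terminated":
                # Never sent to the node; nothing to look for on the host.
                terminated.append(inv["InstanceId"])
            elif inv.get("Status") in ("Failed", "TimedOut"):
                # Assumption: these are the invocations that tripped the threshold.
                errored.append((inv["InstanceId"], detail))
        if not page.get("NextToken"):
            return errored, terminated
        kwargs["NextToken"] = page["NextToken"]
```

To use it, create the client with `boto3.client("ssm")` and pass your own window ID and the command ID of the maintenance window run. The nodes in the first list returned by `split_invocations` are the ones worth investigating, since their failures are what caused the remaining nodes to be terminated.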

AWS
SUPPORT ENGINEER
answered a year ago
EXPERT
reviewed a year ago
