How do I troubleshoot Systems Manager Run Command timeout issues?

I used AWS Systems Manager Run Command to run an SSM document on my managed Amazon Elastic Compute Cloud (Amazon EC2) instance. However, the process failed with a timeout error.

Short description

Run Command timeout status details include the following:

  • Execution timeout: The time, in seconds, for a command to complete before it is considered to have failed. The default is 3600 (1 hour). The maximum value is 172800 (48 hours).
  • Delivery timeout: The command wasn't delivered to the managed node before the total timeout expired.
  • Total timeout: The value of the delivery timeout plus the execution timeout. If the execution timeout isn't required by the SSM document, then total timeout is equal to the delivery timeout plus the default execution timeout.
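
The arithmetic behind the total timeout can be sketched in a few lines of Python. This is only an illustration of the definitions above, not part of any AWS SDK; the function and constant names are made up for this example.

```python
# Values taken from the list above; the names are hypothetical.
DEFAULT_EXECUTION_TIMEOUT = 3600    # seconds (1 hour)
MAX_EXECUTION_TIMEOUT = 172800      # seconds (48 hours)


def total_timeout(delivery_timeout, execution_timeout=None):
    """Return delivery timeout plus execution timeout, in seconds.

    Pass execution_timeout=None when the SSM document doesn't take an
    execution timeout; the default execution timeout is used instead.
    """
    if execution_timeout is None:
        execution_timeout = DEFAULT_EXECUTION_TIMEOUT
    if execution_timeout > MAX_EXECUTION_TIMEOUT:
        raise ValueError("execution timeout exceeds 172800 seconds (48 hours)")
    return delivery_timeout + execution_timeout


print(total_timeout(600))         # 600 + 3600 = 4200
print(total_timeout(600, 7200))   # 600 + 7200 = 7800
```

For example, a command with a 600-second delivery timeout and no explicit execution timeout has a total timeout of 4200 seconds.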

For more information, see Understanding command statuses.

Resolution

Review Run Command status details

  1. Open the Systems Manager console.
  2. From the navigation pane, choose Run Command.
  3. Choose the hyperlinked Command ID to open the Command status page.
  4. From the Targets and outputs section, choose the hyperlinked Instance ID, and then review the output.
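
If you prefer to script the console steps above, the following is a minimal sketch that reads the same status details through the GetCommandInvocation API. It assumes that you pass in a boto3 Systems Manager client (for example, boto3.client("ssm")) with valid credentials; the helper names summarize_invocation and is_timeout are hypothetical.

```python
def summarize_invocation(ssm_client, command_id, instance_id):
    """Fetch one command invocation and keep the fields that help with timeouts.

    ssm_client is assumed to be a boto3 SSM client, or any object that
    exposes a compatible get_command_invocation method.
    """
    resp = ssm_client.get_command_invocation(
        CommandId=command_id,
        InstanceId=instance_id,
    )
    return {
        "status": resp.get("Status"),
        "status_details": resp.get("StatusDetails"),
        "response_code": resp.get("ResponseCode"),
        "stdout": resp.get("StandardOutputContent", ""),
        "stderr": resp.get("StandardErrorContent", ""),
    }


def is_timeout(summary):
    """Return True when the status details report a delivery or execution timeout."""
    return summary["status_details"] in ("Delivery Timed Out", "Execution Timed Out")
```

The returned stdout and stderr values can be truncated in the same way as the console output, so the on-instance files remain the place to look for the full text.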

If the output is truncated, connect to the EC2 instance using SSH, and then open the following files to see the full error details. Note the exit status codes, and then see Troubleshooting Systems Manager Run Command for additional troubleshooting steps.

For Linux and macOS:

/var/lib/amazon/ssm/<instance-id>/document/orchestration/<command-id>/<Plugin-name>/<Step-name>/stdout
/var/lib/amazon/ssm/<instance-id>/document/orchestration/<command-id>/<Plugin-name>/<Step-name>/stderr
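
As a convenience, the Linux and macOS paths above can be assembled from their parts. The following is a rough sketch; the function name is hypothetical, and the IDs, plugin name, and step name in the usage lines are placeholder values that you must replace with the ones from your own command.

```python
from pathlib import PurePosixPath

# Root directory taken from the paths listed above.
ORCHESTRATION_ROOT = PurePosixPath("/var/lib/amazon/ssm")


def output_paths(instance_id, command_id, plugin_name, step_name):
    """Return the (stdout, stderr) file paths for one step of a command."""
    base = (
        ORCHESTRATION_ROOT
        / instance_id
        / "document"
        / "orchestration"
        / command_id
        / plugin_name
        / step_name
    )
    return base / "stdout", base / "stderr"


# Placeholder values for illustration only.
stdout_path, stderr_path = output_paths(
    "i-0123456789abcdef0", "example-command-id", "example-plugin", "example-step"
)
print(stdout_path)
print(stderr_path)
```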

For Windows:

%ProgramData%\Amazon\SSM\InstanceData\<ManagedInstance-ID>\document\orchestration\<Command-ID>\<plug-in>\<step_number.plug-in>\stdout
%ProgramData%\Amazon\SSM\InstanceData\<ManagedInstance-ID>\document\orchestration\<Command-ID>\<plug-in>\<step_number.plug-in>\stderr

Review SSM Agent logs

Review the SSM Agent logs for more details about the failure.

For Linux and macOS, locate the logs in the following directories:

/var/log/amazon/ssm/amazon-ssm-agent.log
/var/log/amazon/ssm/errors.log
/var/log/amazon/ssm/audits/amazon-ssm-agent-audit-YYYY-MM-DD

For Windows, locate the logs in the following directories:

%PROGRAMDATA%\Amazon\SSM\Logs\amazon-ssm-agent.log
%PROGRAMDATA%\Amazon\SSM\Logs\errors.log
%PROGRAMDATA%\Amazon\SSM\Logs\audits\amazon-ssm-agent-audit-YYYY-MM-DD
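
To narrow down a long log, you can filter for lines that mention errors, warnings, or timeouts. The following is a minimal sketch that assumes the log is plain text with one entry per line; the function name, the search pattern, and the sample lines are illustrative and aren't copied from a real SSM Agent log.

```python
import re

# Matches the words ERROR or WARN, or variants such as "timeout" and "timed out".
PATTERN = re.compile(r"\b(ERROR|WARN)\b|timed? ?out", re.IGNORECASE)


def suspicious_lines(lines, command_id=None):
    """Return log lines that look relevant to a timeout.

    When command_id is given, only lines that contain it are considered.
    """
    hits = []
    for line in lines:
        if command_id and command_id not in line:
            continue
        if PATTERN.search(line):
            hits.append(line.rstrip("\n"))
    return hits


# Made-up sample lines, for illustration only.
sample = [
    "2024-01-01 10:00:00 INFO heartbeat sent\n",
    "2024-01-01 10:05:00 ERROR command abc-123 timed out\n",
    "2024-01-01 10:06:00 WARN retrying connection\n",
]
for hit in suspicious_lines(sample, command_id="abc-123"):
    print(hit)
```

To scan a real log, pass an open file object, for example open("/var/log/amazon/ssm/errors.log"), in place of the sample list.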

If the SSM Agent logs don't provide the information that you need to resolve the error, then turn on debug logging and reproduce the issue.

Troubleshoot timeout issues

  • Review the steps in your Run Command SSM document, and verify that the total time required to complete them is less than the timeoutSeconds property value. The default timeoutSeconds value is 3600 seconds (1 hour). For more information on specifying the timeoutSeconds property value, see Handling timeouts in runbooks.
  • The EC2 instance must appear as a managed node, and the SSM Agent ping status must be Online. If your EC2 instance doesn't appear as a managed node or the SSM Agent ping status isn't Online, then additional troubleshooting is required. For more information, see Why is my EC2 instance not displaying as a managed node or showing a "Connection lost" status in Systems Manager?
  • If Run Command runs scripts that reboot managed nodes, then the node might disconnect and cause a timeout. Make sure that your scripts use the correct exit codes. For more information, see Handling reboots when running commands.
  • If your SSM Agent version is 2.0.913 or higher, then the maximum execution timeout value is 172800 seconds (48 hours). Verify that the instance is using the latest version of SSM Agent.
  • When Maintenance Window or State Manager runs the command, confirm that the command is running. To confirm, use AWS CloudTrail to review the SendCommand response.
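
The managed node and SSM Agent version checks from the list above can be sketched with the DescribeInstanceInformation API. This example assumes that you pass in a boto3 Systems Manager client (for example, boto3.client("ssm")); the helper names and the wording of the messages are hypothetical.

```python
# Minimum SSM Agent version for the 48-hour execution timeout, per the list above.
MIN_AGENT_FOR_48H = (2, 0, 913)


def parse_version(version):
    """Turn a dotted version string such as '3.2.582.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def node_issues(info):
    """Inspect one InstanceInformationList entry and return a list of findings."""
    issues = []
    if info.get("PingStatus") != "Online":
        issues.append("SSM Agent ping status is not Online")
    version = info.get("AgentVersion")
    if version and parse_version(version)[:3] < MIN_AGENT_FOR_48H:
        issues.append("SSM Agent is older than 2.0.913")
    if info.get("IsLatestVersion") is False:
        issues.append("SSM Agent is not the latest version")
    return issues


def check_node(ssm_client, instance_id):
    """Look up one instance and report why it might time out."""
    resp = ssm_client.describe_instance_information(
        Filters=[{"Key": "InstanceIds", "Values": [instance_id]}]
    )
    nodes = resp.get("InstanceInformationList", [])
    if not nodes:
        return ["instance is not registered as a managed node"]
    return node_issues(nodes[0])
```

An empty list from check_node means that none of these checks found a problem; it doesn't rule out the other causes in the list above.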

Related information

AWS Systems Manager documents

Setting up AWS Systems Manager

How do I troubleshoot Systems Manager Run Command failures?

AWS OFFICIAL · Updated a year ago