How do I troubleshoot pipeline failures in EC2 Image Builder?

I want to troubleshoot pipeline failures in EC2 Image Builder.

Short description

If your Image Builder pipeline fails, then the error message includes a workflow execution ID and the reason that the pipeline failed:

"Workflow Execution ID: example-workflow-id failed with reason: example-reason"

Each build and test stage of the process has an associated workflow. Container images have an additional workflow that runs during distribution. Each workflow consists of the steps that are required to build your golden Amazon Machine Image (AMI).
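
If you capture the failure message in a script, you can extract the workflow execution ID before you troubleshoot further. The following Python sketch assumes that the message follows the format quoted earlier. The function name parse_failure_message is hypothetical and isn't part of any AWS SDK.

```python
import re

# Pattern mirrors the message format quoted in this article.
# Real IDs and reasons vary, so treat this as an assumption.
_FAILURE = re.compile(
    r"Workflow Execution ID: (?P<id>\S+) failed with reason: (?P<reason>.*)",
    re.DOTALL,
)


def parse_failure_message(message):
    """Return (workflow_execution_id, reason), or None if the text doesn't match."""
    match = _FAILURE.search(message)
    if match is None:
        return None
    # Drop surrounding whitespace and a trailing quotation mark, if present.
    reason = match.group("reason").strip().strip('"')
    return match.group("id"), reason
```

You can then pass the returned ID to the workflow APIs or search for it in your logs.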

Resolution

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshoot AWS CLI errors. Also, make sure that you're using the most recent AWS CLI version.

To troubleshoot your Image Builder pipeline failure, complete the following steps:

Check the workflow stage and step that caused your pipeline failure

To check the workflow stage and step that caused your pipeline failure, use one of the following methods:

Use APIs
To check the workflow stage and step that caused your pipeline failure, call the GetWorkflowExecution and ListWorkflowStepExecutions API operations with your workflow execution ID.
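
As an illustration, the following Python sketch calls both operations through the AWS SDK for Python (Boto3) methods get_workflow_execution and list_workflow_step_executions, and collects the steps whose status is FAILED. The function name find_failed_steps and the shape of the returned dictionary are assumptions made for this example.

```python
def find_failed_steps(client, workflow_execution_id):
    """Summarize a workflow execution and list its failed steps.

    `client` is expected to behave like boto3.client("imagebuilder").
    """
    execution = client.get_workflow_execution(
        workflowExecutionId=workflow_execution_id
    )

    failed_steps = []
    next_token = None
    while True:
        kwargs = {"workflowExecutionId": workflow_execution_id}
        if next_token:
            kwargs["nextToken"] = next_token
        page = client.list_workflow_step_executions(**kwargs)
        for step in page.get("steps", []):
            if step.get("status") == "FAILED":
                failed_steps.append(
                    {
                        "name": step.get("name"),
                        "action": step.get("action"),
                        "message": step.get("message"),
                    }
                )
        # Keep paging until the service stops returning a token.
        next_token = page.get("nextToken")
        if not next_token:
            break

    return {
        "stage": execution.get("type"),
        "status": execution.get("status"),
        "message": execution.get("message"),
        "failed_steps": failed_steps,
    }


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   print(find_failed_steps(boto3.client("imagebuilder"), "example-workflow-id"))
```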

Use the AWS Management Console
To use the AWS Management Console to check the workflow stage and step that caused your pipeline failure, complete the following steps:

  1. Open the Image Builder console.
  2. From the left pane, select Image Pipelines. Then, choose your pipeline.
  3. Under Output Images, choose the Output Image Version that failed.
  4. Under Workflow, select the workflow dropdown list, and then choose the failed stage. The list of steps appears for the failed stage.

When you identify the workflow stage and step that caused your pipeline failure, use the following methods to further troubleshoot.

Check the logs that are sent to CloudWatch Logs

Image Builder publishes detailed workflow logs to the /aws/imagebuilder/ImageName log group and the ImageVersion/ImageBuildVersion log stream. You can check these log streams from the Amazon CloudWatch console or the Image Builder console.
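
If you prefer to read the same log stream from a script, the following Python sketch builds the log group and log stream names from the naming scheme above and reads events with the CloudWatch Logs filter_log_events call in Boto3. The function names and the example values in the usage comment are hypothetical.

```python
def workflow_log_location(image_name, image_version, image_build_version):
    """Return (log_group, log_stream) based on the naming scheme in this article."""
    log_group = f"/aws/imagebuilder/{image_name}"
    log_stream = f"{image_version}/{image_build_version}"
    return log_group, log_stream


def fetch_log_messages(logs_client, image_name, image_version,
                       image_build_version, filter_pattern=None):
    """Collect log messages; `logs_client` should behave like boto3.client("logs")."""
    log_group, log_stream = workflow_log_location(
        image_name, image_version, image_build_version
    )
    kwargs = {"logGroupName": log_group, "logStreamNames": [log_stream]}
    if filter_pattern:
        kwargs["filterPattern"] = filter_pattern

    messages = []
    while True:
        page = logs_client.filter_log_events(**kwargs)
        messages.extend(event["message"] for event in page.get("events", []))
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token
    return messages


# Usage (requires boto3 and AWS credentials; values are placeholders):
#   import boto3
#   for line in fetch_log_messages(boto3.client("logs"), "my-image", "1.0.0", "1"):
#       print(line)
```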

To check the logs that are sent to CloudWatch Logs from the Image Builder console, complete the following steps:

  1. Open the Image Builder console.
  2. From the left pane, select Image Pipelines. Then, choose your pipeline.
  3. Under Output Images, choose the Output Image Version that failed.
  4. Under Workflow, select the workflow dropdown list, and then choose the failed stage.
  5. Select the Step ID of the failed step.
  6. Select Application logs.
  7. Review the Application logs for further troubleshooting.

Note: If you can't troubleshoot your issues after you complete the preceding steps, then continue with the next methods.

Check the logs that are sent to Amazon S3

Amazon Simple Storage Service (Amazon S3) logs show the steps and errors for Amazon Elastic Compute Cloud (Amazon EC2) instance activity during the build process. Amazon S3 logs include log outputs from the component manager and the component definitions. These logs provide detailed JSON output of the steps taken on the Amazon EC2 instance.

If you specified an Amazon S3 bucket name and key prefix in your infrastructure configuration, then the workflow step logs are at the following path: s3://S3BucketName/KeyPrefix/ImageName/ImageVersion/ImageBuildVersion/WorkflowExecutionId/StepName
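
To locate those objects from a script, the following Python sketch assembles the key prefix from the path components above and lists matching objects with the Boto3 list_objects_v2 call. The function names and the sample values are hypothetical, and the sketch assumes that the log objects are stored under the StepName prefix.

```python
def workflow_step_log_prefix(key_prefix, image_name, image_version,
                             image_build_version, workflow_execution_id,
                             step_name):
    """Join the path components from this article into one S3 key prefix."""
    parts = [key_prefix, image_name, image_version, image_build_version,
             workflow_execution_id, step_name]
    # Strip stray slashes so "logs/" and "logs" give the same result.
    return "/".join(str(part).strip("/") for part in parts if part)


def list_step_log_keys(s3_client, bucket, prefix):
    """List object keys under the prefix; `s3_client` should behave like boto3.client("s3")."""
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
    return keys


# Usage (requires boto3 and AWS credentials; values are placeholders):
#   import boto3
#   prefix = workflow_step_log_prefix("logs", "my-image", "1.0.0", "1",
#                                     "example-workflow-id", "ExampleStep")
#   print(list_step_log_keys(boto3.client("s3"), "amzn-s3-demo-bucket", prefix))
```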

If you didn't specify an Amazon S3 bucket name and key prefix, then open the Image Builder console and navigate to the Troubleshooting settings. Then, under Logs, specify an Amazon S3 bucket that Image Builder can write logs to. After you specify the Amazon S3 bucket, run the pipeline again to collect and store the logs in the bucket.

Check the Amazon EC2 instance logs

AWS Task Orchestrator and Executor component manager (AWSTOE) creates log folders on the instances that are used to build and test new images. These log folders are created each time that your component runs. For container images, the log folder is stored in the container.

Note: By default, Image Builder shuts down the build or test instance when the pipeline build fails. To retain your build or test instance for troubleshooting, modify the instance settings of your pipeline's infrastructure configuration resource. To modify the instance settings, open the Image Builder console and navigate to the Troubleshooting settings of your infrastructure configuration resource. Then, deactivate the Terminate instance on failure option.

Also, you can use the update-infrastructure-configuration command to modify the instance settings. Make sure that you set the TerminateInstanceOnFailure value to false. For more information, see Update an infrastructure configuration.
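
The same change can be scripted with Boto3. The following Python sketch reads the current infrastructure configuration and builds an update request that sets terminateInstanceOnFailure to false. It copies the existing settings into the request on the assumption that the update operation might reset fields that you omit. The helper name build_keep_instance_request and the field list are assumptions for this example, so verify them against the current API reference.

```python
# Settings copied from the current configuration into the update request.
# This list is an assumption; check the UpdateInfrastructureConfiguration reference.
UPDATABLE_FIELDS = (
    "description", "instanceTypes", "instanceProfileName", "securityGroupIds",
    "subnetId", "logging", "keyPair", "snsTopicArn", "resourceTags",
    "instanceMetadataOptions", "placement",
)


def build_keep_instance_request(current):
    """Build update arguments that keep the instance after a failed build.

    `current` is the "infrastructureConfiguration" dictionary returned by
    get_infrastructure_configuration.
    """
    request = {"infrastructureConfigurationArn": current["arn"]}
    for field in UPDATABLE_FIELDS:
        if current.get(field):  # skip settings that are unset or empty
            request[field] = current[field]
    request["terminateInstanceOnFailure"] = False
    return request


# Usage (requires boto3 and AWS credentials; `arn` is your configuration ARN):
#   import boto3
#   client = boto3.client("imagebuilder")
#   current = client.get_infrastructure_configuration(
#       infrastructureConfigurationArn=arn)["infrastructureConfiguration"]
#   client.update_infrastructure_configuration(**build_keep_instance_request(current))
```

After you finish troubleshooting, terminate the retained instance manually and consider setting the value back to true.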

Connect to the build or test instance and locate the runtime logs at the following locations:

  • For Linux, check the /var/lib/amazon/toe/file_prefix_runtime_executionID directory.

  • For Windows, check the $env:ProgramFiles\Amazon\TaskOrchestratorAndExecutor\file_prefix_runtime_executionID directory.

    Example log directory on Linux: /var/lib/amazon/toe/TOE_2021-07-01_12-34-56_UTC-0_a1bcd2e3-45f6-789a-bcde-0fa1b2c3def4

In the preceding log directories, check the following files to further troubleshoot your pipeline failure:

  • application.log - Includes timestamped debug level information from AWSTOE that's related to activity that occurs when the component runs.
  • detailedoutput.json - Includes detailed information about run status, inputs, outputs, and failures for all documents, phases, and steps that apply to the component when it runs.
  • console.log - Includes standard out (stdout) and standard error (stderr) information that AWSTOE writes to the console when the component runs.
  • chaining.json - Represents optimizations that AWSTOE applies to resolve chaining expressions.
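
To get a quick first pass over these files on a Linux build instance, the following Python sketch finds the most recently modified log directory and prints lines from application.log and console.log that mention an error or failure. The function names and the keyword list are hypothetical, and the exact log line format is an assumption, so adjust the keywords to what you see in your logs.

```python
from pathlib import Path

# Plain-text log files named in this article.
LOG_FILES = ("application.log", "console.log")


def latest_log_dir(base_dir="/var/lib/amazon/toe"):
    """Return the most recently modified log directory, or None if there is none."""
    dirs = [path for path in Path(base_dir).iterdir() if path.is_dir()]
    return max(dirs, key=lambda path: path.stat().st_mtime) if dirs else None


def find_error_lines(log_dir, keywords=("error", "fail")):
    """Return (file_name, line_number, text) for lines that contain a keyword."""
    hits = []
    for name in LOG_FILES:
        path = Path(log_dir) / name
        if not path.is_file():
            continue
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                lowered = line.lower()
                if any(keyword in lowered for keyword in keywords):
                    hits.append((name, number, line.rstrip()))
    return hits


# Usage on the build or test instance:
#   log_dir = latest_log_dir()
#   if log_dir is not None:
#       for file_name, number, text in find_error_lines(log_dir):
#           print(f"{file_name}:{number}: {text}")
```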

Check CloudTrail events

If AWS CloudTrail is activated in your account, then all build activity is logged. To find the relevant CloudTrail events, filter by the source: imagebuilder.amazonaws.com or by the username: Image Builder, or search for the associated Amazon EC2 instance ID. These filters return events that show further details about your pipeline.

To troubleshoot API calls during the distribution process, filter CloudTrail events by the username: Ec2ImageBuilderIntegrationService. Use this method to retrieve information about failed API calls that Image Builder made.
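
The username filter can also be applied from a script. The following Python sketch uses the CloudTrail lookup_events call in Boto3 with a Username lookup attribute and keeps only events whose record contains an errorCode field. The function name failed_api_calls and the max_pages limit are assumptions made for this example.

```python
import json


def failed_api_calls(cloudtrail_client,
                     username="Ec2ImageBuilderIntegrationService",
                     max_pages=5):
    """Return failed API calls made by `username`.

    `cloudtrail_client` should behave like boto3.client("cloudtrail").
    """
    kwargs = {
        "LookupAttributes": [
            {"AttributeKey": "Username", "AttributeValue": username}
        ]
    }
    failures = []
    for _ in range(max_pages):  # cap the number of pages to bound API calls
        page = cloudtrail_client.lookup_events(**kwargs)
        for event in page.get("Events", []):
            # The full event record is a JSON string in CloudTrailEvent.
            detail = json.loads(event.get("CloudTrailEvent") or "{}")
            if "errorCode" in detail:
                failures.append(
                    {
                        "eventName": detail.get("eventName"),
                        "errorCode": detail["errorCode"],
                        "errorMessage": detail.get("errorMessage"),
                    }
                )
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
    return failures


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   for failure in failed_api_calls(boto3.client("cloudtrail")):
#       print(failure)
```

To filter by event source instead, replace the lookup attribute with the EventSource key and the value imagebuilder.amazonaws.com. CloudTrail accepts one lookup attribute per request.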

Related information

How do I troubleshoot build pipeline timeout errors in EC2 Image Builder?

How do I troubleshoot a FAILED lifecycle policy or a policy that completed, but images are still available in my EC2 Image Builder lifecycle policy?

How do I resolve the 403 Access Denied error for my image build pipeline in EC2 Image Builder?

AWS OFFICIAL
Updated 2 months ago