How do I troubleshoot custom resource failures in CloudFormation?

I want to resolve custom resource errors in AWS CloudFormation.

Short description

Custom resource failures occur because the AWS Lambda function that's associated with the resource encountered an issue when it ran. The custom resource then sends a FAILED status to CloudFormation.

A custom resource can also fail when CloudFormation doesn't receive a response from the custom resource within the expected timeframe and times out.

To troubleshoot custom resource issues, either run the AWSSupport-TroubleshootCFNCustomResource runbook or manually troubleshoot your CloudFormation stack.

For Lambda-backed custom resources, the runbook checks whether the Lambda function can reach Amazon Simple Storage Service (Amazon S3) to send a response to CloudFormation. To do so, the runbook checks the function's network configuration and security groups.

Resolution

Run the AWSSupport-TroubleshootCFNCustomResource runbook

Before you begin, make sure that your AWS Identity and Access Management (IAM) user or role has the required IAM permissions.

To run the automation, complete the following steps:

  1. Open the AWS Systems Manager console.
  2. In the navigation pane, choose Documents.
  3. In the search bar, enter AWSSupport-TroubleshootCFNCustomResource.
  4. Select the AWSSupport-TroubleshootCFNCustomResource document.
  5. Choose Execute automation.
  6. For the input parameters, enter the following:
    (Optional) AutomationAssumeRole. Enter the ARN of the IAM role that allows Automation, a capability of AWS Systems Manager, to perform the actions on your behalf. If you don't specify a role, then Automation uses the permissions of the user that starts the runbook.
    StackName. Enter the name of the CloudFormation stack where the custom resource failed.
  7. Choose Execute.
  8. Review the Outputs section for the following detailed results:
    The validateCloudFormationStack step verifies that the CloudFormation stack exists in the same AWS account and AWS Region.
    The checkCustomResource step analyzes the CloudFormation stack, checks the failed custom resource, and provides information on how to troubleshoot the failed custom resource.
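
The console steps above can also be scripted. The following is a minimal sketch, assuming that the boto3 SDK is installed and that credentials are configured. The helper names and the my-stack stack name are hypothetical; the StackName and AutomationAssumeRole parameters are the ones listed in step 6.

```python
def build_automation_request(stack_name, assume_role_arn=None,
                             document_name="AWSSupport-TroubleshootCFNCustomResource"):
    """Build keyword arguments for the Systems Manager StartAutomationExecution call."""
    # Automation parameters are passed as lists of strings.
    parameters = {"StackName": [stack_name]}
    if assume_role_arn:
        parameters["AutomationAssumeRole"] = [assume_role_arn]
    return {"DocumentName": document_name, "Parameters": parameters}


def start_troubleshooting(stack_name, assume_role_arn=None, ssm_client=None):
    """Start the runbook and return the automation execution ID."""
    if ssm_client is None:
        import boto3  # assumption: boto3 is installed and credentials are configured
        ssm_client = boto3.client("ssm")
    request = build_automation_request(stack_name, assume_role_arn)
    reply = ssm_client.start_automation_execution(**request)
    return reply["AutomationExecutionId"]


# Hypothetical usage:
# execution_id = start_troubleshooting("my-stack")
```

You can then look up the returned execution ID in the Systems Manager console to review the outputs described in step 8.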

Manually troubleshoot your CloudFormation stack

Check the Amazon CloudWatch logs

Complete the following steps:

  1. Open the CloudFormation console.
  2. Select the failed stack, and then choose the Resources tab to get the physical ID of the Lambda function that's associated with the custom resource.
  3. Select your Lambda function.
  4. Choose the Monitor tab, and then choose View CloudWatch logs.

If the Lambda function was deleted during the CloudFormation rollback, then the log group might still contain the CloudWatch logs.

To get the logs, complete the following steps:

  1. Open the CloudWatch console.
  2. In the navigation pane, choose Log groups.
  3. In the search field, enter the following log group name:
    /aws/lambda/LambdaPhysicalName

Note: Replace LambdaPhysicalName with your Lambda function's name.
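
To check from a script whether the log group survived the rollback, you could query CloudWatch Logs for the name shown above. This is a rough sketch that assumes boto3 is installed and that the function uses the default /aws/lambda/ log group naming; the helper names are hypothetical.

```python
def lambda_log_group(function_name):
    """Return the default CloudWatch log group name for a Lambda function."""
    return f"/aws/lambda/{function_name}"


def log_group_exists(function_name, logs_client=None):
    """Return True if the function's default log group is still present."""
    if logs_client is None:
        import boto3  # assumption: boto3 is installed and credentials are configured
        logs_client = boto3.client("logs")
    name = lambda_log_group(function_name)
    # Only the first page of results is inspected, which is enough for a sketch.
    page = logs_client.describe_log_groups(logGroupNamePrefix=name)
    return any(group["logGroupName"] == name for group in page.get("logGroups", []))
```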

If you can't find the logs, then turn off the rollback feature and redeploy the stack to troubleshoot the Lambda function's behavior.

Troubleshoot potential failure causes

Resolve the FAILED status

You might receive the following error message: 

"Received response status FAILED from custom resource. Message returned: <reason here>."

You receive the preceding error message when the Lambda function that's associated with the custom resource encounters an issue and the function's exception handling logic reports the failure to CloudFormation.

To troubleshoot this issue, review the reason for failure that's included in the error message and the CloudWatch logs for Lambda.
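
For context, the following sketch shows one way a handler can produce the FAILED status and the "Message returned" text: it catches the exception and puts the message in the Reason field of the response. It uses only the Python standard library instead of the cfn-response module. The do_work function and the example-custom-resource default ID are hypothetical placeholders.

```python
import json
import urllib.request


def build_response(event, status, physical_id, reason="", data=None):
    """Build the JSON body that CloudFormation expects from a custom resource."""
    return {
        "Status": status,  # "SUCCESS" or "FAILED"
        "Reason": reason,  # surfaces in the stack event as "Message returned"
        "PhysicalResourceId": physical_id,
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": data or {},
    }


def put_response(url, body):
    """PUT the body to the presigned Amazon S3 URL with an empty Content-Type."""
    payload = json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, method="PUT", headers={"Content-Type": ""}
    )
    with urllib.request.urlopen(request, timeout=10) as reply:
        return reply.status


def do_work(event):
    """Hypothetical placeholder for the real create, update, or delete logic."""
    raise RuntimeError("example failure")


def handler(event, context, work=do_work, send=put_response):
    # Reuse the existing physical ID on Update and Delete requests.
    physical_id = event.get("PhysicalResourceId", "example-custom-resource")
    try:
        data = work(event)
        body = build_response(event, "SUCCESS", physical_id, data=data)
    except Exception as error:
        print(f"Custom resource failed: {error}")  # written to CloudWatch Logs
        body = build_response(event, "FAILED", physical_id, reason=str(error))
    # Always respond, so that the stack doesn't wait for a timeout.
    send(event["ResponseURL"], body)
    return body
```

Because the response is sent on both paths, an exception in the resource logic results in a FAILED status with a readable reason, not a silent timeout.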

CloudFormation doesn't receive a response

The stack fails because CloudFormation doesn't receive a response from the custom resource. 

To resolve this issue, take the following actions:

  • Make sure that you're correctly using the cfn-response module in your custom resource's Lambda function to send a response to the CloudFormation stack. 

  • Review the CloudWatch logs to determine whether errors occur when the code runs. 

  • Increase your Lambda function's timeout setting so that the function has enough time to complete the task. The maximum time that you can set is 15 minutes.

  • If your Lambda function is in a virtual private cloud (VPC), then confirm that the function's subnet allows outbound traffic through a NAT gateway. The subnet must also have a route to an Amazon S3 endpoint so that the custom resource can access the presigned Amazon S3 URL.

  • If a response was sent after a timeout, then check your Lambda metrics for a high number of concurrent function executions in the same Region. To reduce timeouts, configure reserved concurrency for your function.

  • If your stack remains in the IN_PROGRESS status until the custom resource times out, then use cURL to send the response directly to the presigned Amazon S3 URL. A direct response might let the stack proceed before the timeout occurs.
    Example curl command:

    curl -H 'Content-Type:' -X PUT -d '{
        "Status": "SUCCESS",
        "PhysicalResourceId": "test-CloudWatchtrigger-1URTEVUHSKSKDFF",
        "StackId": "arn:aws:cloudformation:us-east-1:111122223333:stack/awsexamplecloudformation/33ad60e0-5f25-11e9-a734-0aa6b80efab2",
        "RequestId": "e2fc8f5c-0391-4a65-a645-7c695646739",
        "LogicalResourceId": "CloudWatchtrigger"
      }' 'https://cloudformation-custom-resource-response-useast1.s3.us-east-1.amazonaws.com/arn%3Aaws%3Acloudformation%3Aus-east-1%3A111122223333%3Astack/awsexamplecloudformation/33ad60e0-5f25-11e9-a734-0aa6b80efab2%7CMyCustomResource%7Ce2fc8f5c-0391-4a65-a645-7c695646739?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20170313T0212304Z&X-Amz-SignedHeaders=host&X-Amz-Expires=7200&X-Amz-Credential=QWERTYUIOLASDFGBHNZCV%2F20190415%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=dgvg36bh23mk44nj454bjb54689bg43r8v011uerehiubrjrug5689ghg94hb'

    Note: To make the request, you must include the details of the request object. You can find the RequestId and the Amazon S3 presigned URL in your CloudWatch logs. For more information, see How do I delete a Lambda-backed custom resource that's stuck in DELETE_FAILED status or DELETE_IN_PROGRESS status in CloudFormation?
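
Related to the timeout guidance in the list above, one defensive pattern is for the function to watch its own remaining run time and report FAILED before Lambda stops it, so that CloudFormation always receives a response. The following is a sketch only. It relies on the get_remaining_time_in_millis method of the Lambda context object; run_with_deadline, the send_failed callback, and the 10-second margin are hypothetical choices.

```python
def run_with_deadline(steps, context, send_failed, margin_ms=10_000):
    """Run each step in order, but report FAILED if too little run time remains.

    steps:       list of zero-argument callables that make up the task
    context:     the Lambda context object passed to the handler
    send_failed: callable that sends a FAILED response with the given reason
    """
    for step in steps:
        # Stop early while there is still time to send a response.
        if context.get_remaining_time_in_millis() < margin_ms:
            send_failed("Ran out of time before the task completed")
            return False
        step()
    return True
```

The check runs only between steps, so this approach helps when the work can be split into short units; a single long-running call can still exceed the function's timeout.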

Related information

Running a simple automation (console)

Setting up Automation

Systems Manager Automation runbook reference

What are some best practices to implement Lambda backed custom resources with CloudFormation?