How do I use the AWS CDK Provider Framework for long-running custom resource operations?
I have a custom resource that runs an AWS operation that exceeded the AWS Lambda timeout. I want to use the AWS Cloud Development Kit (AWS CDK) Provider Framework to manage the long-running operation.
Short description
AWS CDK custom resource providers that run in Lambda functions have a maximum timeout of 15 minutes. To manage operations that exceed 15 minutes, use the built-in asynchronous operation support from the AWS CDK Provider Framework. For example, you can use the Provider Framework for an AWS Step Functions execution, a database migration, or a machine learning training job.
To use the Provider Framework, you must create two Lambda functions. Then, configure the functions with the Provider construct in your AWS CDK stack. The onEvent handler starts the operation and returns a PhysicalResourceId value. The isComplete handler polls the operation status until it completes or times out.
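The division of work between the two handlers can be modeled locally before any AWS resources exist. The following is a minimal sketch, not the Provider Framework's implementation: the fake operation, the operations map, and the simulate function are hypothetical names used only for illustration, and the handlers are synchronous here although real handlers are async and the framework waits between calls.

```typescript
// Hypothetical local model of the onEvent / isComplete contract.
// No AWS calls are made; all names below are illustrative assumptions.

type RequestType = 'Create' | 'Update' | 'Delete';

interface OnEventResult {
  PhysicalResourceId: string;
}

interface IsCompleteResult {
  IsComplete: boolean;
}

// Fake long-running operation: reports "done" on the Nth status check.
function createFakeOperation(checksUntilDone: number): () => boolean {
  let checks = 0;
  return () => {
    checks += 1;
    return checks >= checksUntilDone;
  };
}

const operations = new Map<string, () => boolean>();

// onEvent: start the operation and return its identifier immediately.
function onEvent(event: { RequestType: RequestType }): OnEventResult {
  if (event.RequestType !== 'Delete') {
    operations.set('op-1', createFakeOperation(3));
  }
  return { PhysicalResourceId: 'op-1' };
}

// isComplete: report whether the operation identified by PhysicalResourceId is done.
function isComplete(event: { PhysicalResourceId: string }): IsCompleteResult {
  const check = operations.get(event.PhysicalResourceId);
  if (!check) {
    throw new Error(`Unknown operation: ${event.PhysicalResourceId}`);
  }
  return { IsComplete: check() };
}

// Simulated polling loop: one onEvent call, then isComplete until done or out of polls.
function simulate(maxPolls: number): number {
  const { PhysicalResourceId } = onEvent({ RequestType: 'Create' });
  for (let poll = 1; poll <= maxPolls; poll++) {
    if (isComplete({ PhysicalResourceId }).IsComplete) {
      return poll;
    }
  }
  throw new Error('Operation timed out');
}
```

In this model, simulate(10) finishes on the third poll, and simulate(2) throws because the poll budget runs out first. The budget plays the role that totalTimeout plays in a real deployment.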
Resolution
Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.
The following resolution uses the Provider Framework to manage a Step Functions execution as a long-running operation.
Prerequisites:
- Configure AWS CLI credentials for the AWS account and AWS Region where you intend to deploy. The AWS CDK uses the default AWS CLI profile unless you specify a different one. To use a specific profile or Region, configure the AWS_PROFILE and AWS_REGION environment variables. Or, add --profile profile-name to each AWS CDK command. For more information, see Configuration and credential file settings in the AWS CLI.
- Install Node.js 22.x or later. To download the tool, see Download Node.js on the Node.js website.
- Create a TypeScript AWS CDK application. If you don't have an existing application, then run the following command to create an empty directory, switch to the directory, and initialize a new project:
  mkdir my-async-resource && cd my-async-resource && cdk init app --language typescript
- Run the following command from the root of your AWS CDK project to add the @aws-sdk/client-sfn package for TypeScript type-checking during development:
  npm install --save-dev @aws-sdk/client-sfn
Create the onEvent handler
In your AWS CDK project directory, create a file that's named handlers/on-event.ts with the following code:
import { SFNClient, StartExecutionCommand } from '@aws-sdk/client-sfn';

const sfn = new SFNClient({});

export async function onEvent(event: any) {
  console.log('Event:', JSON.stringify(event, null, 2));

  const stateMachineArn = process.env.STATE_MACHINE_ARN;
  if (!stateMachineArn) {
    throw new Error('STATE_MACHINE_ARN environment variable is not set.');
  }

  if (event.RequestType === 'Create' || event.RequestType === 'Update') {
    const command = new StartExecutionCommand({
      stateMachineArn: stateMachineArn,
      input: JSON.stringify({
        requestId: event.RequestId,
        resourceProperties: event.ResourceProperties
      })
    });
    const response = await sfn.send(command);

    return {
      PhysicalResourceId: response.executionArn,
      Data: { ExecutionArn: response.executionArn }
    };
  }

  if (event.RequestType === 'Delete') {
    return { PhysicalResourceId: event.PhysicalResourceId };
  }

  throw new Error(`Unknown request type: ${event.RequestType}`);
}
Note: The preceding handler code starts a Step Functions execution.
Create the isComplete handler
In your AWS CDK project directory, create a file that's named handlers/is-complete.ts with the following code:
import { SFNClient, DescribeExecutionCommand } from '@aws-sdk/client-sfn';

const sfn = new SFNClient({});

export async function isComplete(event: any) {
  console.log('IsComplete Event:', JSON.stringify(event, null, 2));

  if (event.RequestType === 'Delete') {
    return { IsComplete: true };
  }

  const executionArn = event.PhysicalResourceId;
  const command = new DescribeExecutionCommand({ executionArn: executionArn });
  const response = await sfn.send(command);

  if (response.status === 'SUCCEEDED') {
    return {
      IsComplete: true,
      Data: {
        ExecutionArn: executionArn,
        Status: response.status,
        Output: response.output
      }
    };
  }

  if (response.status === 'FAILED' || response.status === 'TIMED_OUT' || response.status === 'ABORTED') {
    throw new Error(`Execution failed with status: ${response.status}`);
  }

  return { IsComplete: false };
}
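The preceding code calls the Step Functions DescribeExecution API to check the status of the operation. When the execution succeeds, the isComplete handler returns { IsComplete: true } along with the execution details in Data. If the execution is still running, then the handler returns { IsComplete: false } and the Provider Framework invokes the handler again after the query interval. If the execution fails, times out, or is aborted, then the handler throws an error and the CloudFormation operation for the custom resource fails.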
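The three outcomes can be isolated in a small pure function, which makes the decision logic easy to unit test without calling Step Functions. The following is a minimal sketch under that assumption: toIsCompleteResult and FAILURE_STATUSES are hypothetical names, and the function mirrors only the status checks from the handler, not its Data payload.

```typescript
// Hypothetical helper that mirrors the status checks in the isComplete handler.
// It takes the status string from DescribeExecution and decides what to return.

const FAILURE_STATUSES = new Set<string>(['FAILED', 'TIMED_OUT', 'ABORTED']);

function toIsCompleteResult(status: string): { IsComplete: boolean } {
  if (status === 'SUCCEEDED') {
    // The operation finished: tell the framework to stop polling.
    return { IsComplete: true };
  }
  if (FAILURE_STATUSES.has(status)) {
    // A thrown error fails the custom resource operation.
    throw new Error(`Execution failed with status: ${status}`);
  }
  // Any other status (for example, RUNNING) means "ask again later".
  return { IsComplete: false };
}
```

Treating every unrecognized status as "still running" keeps the handler polling until totalTimeout. If you prefer to fail fast on unexpected statuses, throw an error in the final branch instead.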
Configure the Provider construct in your AWS CDK stack
The Provider construct connects the two handlers.
To configure the Provider construct, add the following code to your AWS CDK stack file, such as lib/my-stack.ts:
import * as cdk from 'aws-cdk-lib';
import * as cr from 'aws-cdk-lib/custom-resources';
import * as nodejs from 'aws-cdk-lib/aws-lambda-nodejs';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as sfn from 'aws-cdk-lib/aws-stepfunctions';
import * as path from 'path';
import { Construct } from 'constructs';

export class AsyncCustomResourceStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Step Functions state machine that represents the long-running operation.
    // Replace this example with your own state machine, or import an existing
    // one using sfn.StateMachine.fromStateMachineArn().
    const stateMachine = new sfn.StateMachine(this, 'WorkflowStateMachine', {
      definitionBody: sfn.DefinitionBody.fromChainable(
        new sfn.Wait(this, 'WaitStep', {
          time: sfn.WaitTime.duration(cdk.Duration.minutes(5))
        })
      ),
      timeout: cdk.Duration.hours(1)
    });

    // The handlers directory is at the project root, one level above lib/.
    // Adjust the path if your stack file or handlers are in a different location.
    const onEventHandler = new nodejs.NodejsFunction(this, 'OnEventHandler', {
      runtime: lambda.Runtime.NODEJS_22_X,
      handler: 'onEvent',
      entry: path.join(__dirname, '..', 'handlers', 'on-event.ts'),
      environment: {
        STATE_MACHINE_ARN: stateMachine.stateMachineArn
      },
      timeout: cdk.Duration.minutes(2),
      bundling: {
        externalModules: ['@aws-sdk/*']
      }
    });

    const isCompleteHandler = new nodejs.NodejsFunction(this, 'IsCompleteHandler', {
      runtime: lambda.Runtime.NODEJS_22_X,
      handler: 'isComplete',
      entry: path.join(__dirname, '..', 'handlers', 'is-complete.ts'),
      timeout: cdk.Duration.minutes(2),
      bundling: {
        externalModules: ['@aws-sdk/*']
      }
    });

    // Grant permissions to the handlers.
    // grantStartExecution() grants states:StartExecution to the onEvent handler.
    stateMachine.grantStartExecution(onEventHandler);
    // grantRead() grants states:DescribeExecution (and related read actions)
    // to the isComplete handler so it can poll execution status.
    stateMachine.grantRead(isCompleteHandler);

    const provider = new cr.Provider(this, 'AsyncProvider', {
      onEventHandler: onEventHandler,
      isCompleteHandler: isCompleteHandler,
      queryInterval: cdk.Duration.seconds(30),
      totalTimeout: cdk.Duration.hours(2)
    });

    new cdk.CustomResource(this, 'AsyncResource', {
      serviceToken: provider.serviceToken,
      properties: {
        Message: 'This triggers an async operation'
      }
    });
  }
}
Note: To change how often the Provider Framework invokes the isComplete handler, update the value for queryInterval. To change the maximum wait time, update the value for totalTimeout. The preceding example uses a Step Functions state machine. Replace the state machine with your long-running workflow.
The externalModules: ['@aws-sdk/*'] configuration excludes the AWS SDK from the deployment bundle because the SDK is already available in the Node.js Lambda runtime. If your handlers import an @aws-sdk/* client that isn't included in your runtime, then remove '@aws-sdk/*' from externalModules to bundle your client with your function code.
You can also customize bundling with externalModules to exclude modules that the runtime already provides, minify to reduce package size, and sourceMap to activate source maps for debugging. For more bundling options, see interface BundlingOptions.
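When you choose values for queryInterval and totalTimeout, it can help to estimate how many times the isComplete handler might run. The following sketch is a rough estimate only and doesn't model the framework's internals. The function name is hypothetical, and the whole-multiple check reflects an assumption that the framework expects totalTimeout to divide evenly by queryInterval. Verify that behavior against your AWS CDK version.

```typescript
// Hypothetical helper: rough upper bound on isComplete invocations.
// Assumption: the handler runs about once per queryInterval until totalTimeout.

interface PollEstimate {
  maxPolls: number;         // approximate upper bound on invocations
  isWholeMultiple: boolean; // true when totalTimeout divides evenly by queryInterval
}

function estimatePolls(queryIntervalSeconds: number, totalTimeoutSeconds: number): PollEstimate {
  if (queryIntervalSeconds <= 0 || totalTimeoutSeconds <= 0) {
    throw new Error('Durations must be positive.');
  }
  const ratio = totalTimeoutSeconds / queryIntervalSeconds;
  return {
    maxPolls: Math.floor(ratio),
    isWholeMultiple: Number.isInteger(ratio)
  };
}

// Values from the stack above: 30-second interval, 2-hour total timeout.
const estimate = estimatePolls(30, 2 * 60 * 60);
console.log(estimate.maxPolls); // 240
```

A shorter queryInterval detects completion sooner but results in more Lambda invocations. A longer interval reduces invocations but can delay the CloudFormation status change by up to one interval.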
Deploy and verify the custom resource
From the root of your AWS CDK project, run the following commands to synthesize the CloudFormation template and deploy the stack:
cdk synth
cdk deploy
Note: The preceding commands use the default AWS CLI profile.
To confirm that the custom resource completed successfully, take one or more of the following actions:
- On the CloudFormation console, verify that the stack status changed from CREATE_IN_PROGRESS to CREATE_COMPLETE.
  Note: The state machine includes a 5-minute Wait step. The CloudFormation CREATE_IN_PROGRESS state lasts about the same time.
- Make sure that the state machine execution that the custom resource started is in the Succeeded status in the Executions section of the Step Functions console.
- Check Amazon CloudWatch Logs for the onEvent and isComplete Lambda functions. The onEvent log shows a single invocation at stack deploy. The isComplete log shows repeated invocations based on your queryInterval, and ends with {"IsComplete": true}.
Clean up your configuration
To avoid future charges, remove the resources that you created when you no longer require them. Run the following command from the root of your AWS CDK project:
cdk destroy
When AWS CDK removes the stack from your account, it removes the following resources:
- The Step Functions state machine
- The Lambda handler functions
- The Provider Framework's internal state machine
- Associated AWS Identity and Access Management (IAM) roles
- The CloudFormation stack
Note: By default, AWS CDK keeps log groups for the Lambda functions. To remove the log groups, manually delete the log groups. Or, configure your NodejsFunction constructs with a logRetention value that allows AWS CDK to manage the log group lifecycle.
Troubleshoot issues
The operation times out
If the operation times out, then increase the value for totalTimeout in the Provider construct.
If the operation still times out, then check CloudWatch Logs for the isComplete handler to identify whether the operation is stuck. If the underlying operation is stuck, then manually cancel it. For example, stop the Step Functions execution. Then, increase the totalTimeout value.
The Provider Framework doesn't invoke the isComplete handler
Check CloudWatch Logs to verify that the onEvent handler returned a PhysicalResourceId and didn't receive an error. If there are errors, then make sure that the onEvent function's execution role has the permissions that the handler requires, such as states:StartExecution.
The operation completes but CloudFormation shows that it's in progress
Confirm that the isComplete handler returns { IsComplete: true } in the correct response format. For information about the expected response format, see Asynchronous providers: isComplete.
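A quick local check can catch common response mistakes, such as a lowercase isComplete key, before you deploy. The following is a minimal sketch: validateIsCompleteResponse is a hypothetical helper, and the rule that Data is only allowed when IsComplete is true is an assumption based on the framework's documented contract. Confirm it against the documentation for your AWS CDK version.

```typescript
// Hypothetical validator for the object that an isComplete handler returns.
// Returns a list of problems; an empty list means no issues were found.

function validateIsCompleteResponse(response: unknown): string[] {
  if (typeof response !== 'object' || response === null) {
    return ['Response must be an object.'];
  }
  const fields = response as Record<string, unknown>;
  const problems: string[] = [];

  // The key is case-sensitive: "IsComplete", not "isComplete".
  if (typeof fields.IsComplete !== 'boolean') {
    problems.push('IsComplete must be a boolean. Check the capitalization of the key.');
  }

  // Assumption: Data is only accepted together with IsComplete: true.
  if (fields.IsComplete === false && fields.Data !== undefined) {
    problems.push('Data should be returned only when IsComplete is true.');
  }

  return problems;
}
```

For example, validateIsCompleteResponse({ isComplete: true }) reports a problem because the key is lowercase, and validateIsCompleteResponse({ IsComplete: true }) returns an empty list.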