How do I set up EventBridge retries and DLQ for failed invocations?

I want to set up Amazon EventBridge retries and a dead-letter queue (DLQ) to troubleshoot a FailedInvocations issue.

Short description

You might run into issues where the EventBridge rule fails to invoke the target. You can verify this by reviewing the Amazon CloudWatch metric FailedInvocations for the EventBridge rule. The FailedInvocations metric represents the number of invocations that failed permanently. This usually occurs because of an issue with permissions or configuration of the target resource, or because of network conditions.
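As a rough sketch of how you might check this metric programmatically, the following Python builds the request parameters for the CloudWatch GetMetricStatistics API. The rule name is hypothetical, and the sketch assumes that the rule is on the default event bus, so only the RuleName dimension is set.

```python
from datetime import datetime, timedelta, timezone

def failed_invocations_query(rule_name, hours=24, period_seconds=300):
    """Build keyword arguments for CloudWatch GetMetricStatistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Events",          # namespace for EventBridge metrics
        "MetricName": "FailedInvocations",
        "Dimensions": [{"Name": "RuleName", "Value": rule_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,           # one data point per 5 minutes
        "Statistics": ["Sum"],
    }

query = failed_invocations_query("my-example-rule")  # hypothetical rule name
print(query["Namespace"], query["MetricName"], query["Dimensions"])

# With boto3 installed and credentials configured, the call would be:
#   boto3.client("cloudwatch").get_metric_statistics(**query)
```

A nonzero Sum in any period suggests that the rule matched events but couldn't deliver them to at least one target.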

Resolution

Set up EventBridge retries and DLQ to troubleshoot and resolve the issue with FailedInvocations.

EventBridge retries

When an event isn't successfully delivered to a target because of retriable errors, EventBridge retries sending the event.

By default, EventBridge retries sending the event for up to 24 hours and up to 185 times, using exponential backoff with jitter (a randomized delay). In the retry policy settings for the target, you can customize both the maximum time that EventBridge keeps retrying and the maximum number of retry attempts.
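To illustrate what exponential backoff with jitter means in general, here is a minimal sketch. It isn't EventBridge's internal schedule: the base delay, the cap, and the "full jitter" strategy are assumptions chosen only for illustration.

```python
import random

def backoff_delays(attempts, base=1.0, cap=300.0, seed=None):
    """Return one randomized delay (in seconds) per retry attempt.

    The ceiling doubles with each attempt until it reaches `cap`, and the
    actual delay is drawn uniformly between 0 and that ceiling.
    """
    rng = random.Random(seed)
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * 2 ** attempt)  # exponential growth, capped
        delays.append(rng.uniform(0, ceiling))   # jitter spreads retries out
    return delays

for number, delay in enumerate(backoff_delays(6, seed=1), start=1):
    print(f"retry {number}: wait {delay:.2f}s")
```

The randomization keeps many failed deliveries from retrying at the same instant, which would otherwise put synchronized load on a target that is still recovering.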

Note: When a target responds with a 5xx or 429 HTTP status code, EventBridge retries delivery for up to 24 hours. EventBridge doesn't retry other 4xx HTTP errors.
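The rule in the preceding note can be expressed as a small predicate. This is only a restatement of the note for HTTP targets, not EventBridge code, and the function name is hypothetical.

```python
def is_retried(status_code):
    """Return True for HTTP status codes that the note says are retried."""
    return status_code == 429 or 500 <= status_code <= 599

for code in (400, 403, 404, 429, 500, 503):
    print(code, "retried" if is_retried(code) else "not retried")
```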

To configure the retry policy for a target using the Amazon EventBridge console, follow these steps:

  1. In the EventBridge console, open the EventBridge rule that you want to configure the retry policy for.
  2. Choose the Targets tab. Then, choose Edit.
  3. Expand the Additional settings tab of the target that you want to set up the retry policy for.
  4. Enter custom values for Maximum age of event (default is 24 hours) and Retry attempts (default is 185).

Note: Set up the retry policy individually for each target. If you have an event rule with multiple targets, set up the retry policy for each target.

To configure a retry policy using the AWS Command Line Interface (AWS CLI), use the put-targets command.
Note: If you receive errors when you run AWS CLI commands, then see Troubleshoot AWS CLI errors. Also, make sure that you're using the most recent AWS CLI version.
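As a sketch of what a put-targets request with a retry policy might look like, the following Python builds the target definition and prints an equivalent AWS CLI command. The rule name, target ID, and target ARN are hypothetical placeholders. The range checks reflect the documented limits for RetryPolicy (0 to 185 attempts, 60 to 86400 seconds).

```python
import json

def target_with_retry_policy(target_id, target_arn,
                             max_attempts=185, max_age_seconds=86400):
    """Build one entry for the Targets list of a PutTargets request."""
    if not 0 <= max_attempts <= 185:
        raise ValueError("MaximumRetryAttempts must be between 0 and 185")
    if not 60 <= max_age_seconds <= 86400:
        raise ValueError("MaximumEventAgeInSeconds must be between 60 and 86400")
    return {
        "Id": target_id,
        "Arn": target_arn,
        "RetryPolicy": {
            "MaximumRetryAttempts": max_attempts,
            "MaximumEventAgeInSeconds": max_age_seconds,
        },
    }

# Hypothetical target: retry at most 10 times within 1 hour.
target = target_with_retry_policy(
    "my-target-id",
    "arn:aws:lambda:us-east-1:111122223333:function:my-example-function",
    max_attempts=10,
    max_age_seconds=3600,
)
print("aws events put-targets --rule my-example-rule "
      f"--targets '{json.dumps([target])}'")
```

Because put-targets replaces the definition of any target that has the same Id, include the target's other existing settings (for example, an input transformer) in the same request.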

Event error details

EventBridge handles event errors in different ways. Some errors aren't retriable. For example, EventBridge might not deliver an event because it's missing permissions for the target, or because the target no longer exists. In these cases, EventBridge doesn't retry. Instead, it sends the event to the DLQ if you have one configured, or drops the event if you don't.

EventBridge DLQ

Configure a DLQ to avoid losing events because of delivery failure. The DLQ receives all failed events for processing later.
Note: By default, a DLQ isn't configured for an EventBridge target. You must create an Amazon Simple Queue Service (Amazon SQS) queue before you configure a DLQ for an event rule target.

With a DLQ configured, EventBridge publishes additional CloudWatch metrics: InvocationsSentToDlq and InvocationsFailedToBeSentToDlq. For more information, see EventBridge metrics.
Note: InvocationsFailedToBeSentToDlq is usually reported when the SQS queue that's used as the DLQ is encrypted and the AWS KMS key policy doesn't grant EventBridge the necessary permissions. It's also reported when the SQS resource policy is missing the required permissions.
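As a sketch of the SQS resource policy that lets EventBridge send failed events to the DLQ, the following Python builds a policy statement that allows the events.amazonaws.com service principal to call sqs:SendMessage, scoped to one rule. The queue and rule ARNs are hypothetical. The AWS KMS key policy for an encrypted queue isn't covered here.

```python
import json

def dlq_queue_policy(queue_arn, rule_arn):
    """Build an SQS resource policy that lets one rule write to the DLQ."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowEventBridgeToSendToDlq",
                "Effect": "Allow",
                "Principal": {"Service": "events.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Restrict access to events that come from this rule only.
                "Condition": {"ArnEquals": {"aws:SourceArn": rule_arn}},
            }
        ],
    }

policy = dlq_queue_policy(
    "arn:aws:sqs:us-east-1:111122223333:my-example-dlq",          # hypothetical
    "arn:aws:events:us-east-1:111122223333:rule/my-example-rule",  # hypothetical
)
print(json.dumps(policy, indent=2))
```

You would set the resulting JSON as the queue's Policy attribute. If the queue already has a policy, add the statement to it instead of replacing the whole document.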

To configure DLQ using the Amazon EventBridge console, follow these steps:

  1. Log in to the Amazon EventBridge console.
  2. Open the EventBridge rule that you want to configure the DLQ for.
  3. Choose the Targets tab. Then, choose Edit.
  4. Expand the Additional settings tab of the target that you want to set up the DLQ for.
  5. In the Dead-letter queue section, select the appropriate option based on whether the SQS queue that you use as the DLQ is in the same account or a different account.

To configure a DLQ using the AWS CLI, use the put-targets command. In the target definition, set the Arn field of the DeadLetterConfig parameter to the Amazon Resource Name (ARN) of the SQS queue.
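A minimal sketch of the target definition with a DLQ attached follows. The target ID and all ARNs are hypothetical, and the ARN check is only a basic sanity test, not full validation.

```python
import json

def target_with_dlq(target_id, target_arn, dlq_arn):
    """Build one Targets entry whose failed events go to an SQS queue."""
    parts = dlq_arn.split(":")
    if len(parts) < 6 or parts[0] != "arn" or parts[2] != "sqs":
        raise ValueError("DeadLetterConfig.Arn must be the ARN of an SQS queue")
    return {
        "Id": target_id,
        "Arn": target_arn,
        "DeadLetterConfig": {"Arn": dlq_arn},
    }

dlq_target = target_with_dlq(
    "my-target-id",
    "arn:aws:lambda:us-east-1:111122223333:function:my-example-function",
    "arn:aws:sqs:us-east-1:111122223333:my-example-dlq",
)
print("aws events put-targets --rule my-example-rule "
      f"--targets '{json.dumps([dlq_target])}'")
```

A RetryPolicy block can sit alongside DeadLetterConfig in the same target entry, so that events reach the DLQ only after the configured retries are exhausted.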

Retrieve error details from DLQ

Review the InvocationsSentToDlq metric to confirm whether EventBridge sent any events to the DLQ. If it did, then follow these steps to inspect a message:

  1. In the Amazon SQS console, choose the queue that's being used as DLQ for an event rule.
  2. Choose Send and receive messages on the top right. Then, choose Poll for messages.
  3. Locate the message in the Messages section, and then open the message. Choose Attributes to see details about why EventBridge couldn't deliver the event to the rule's target.
    Note: Data in the Body tab is the actual payload that EventBridge would send to the target if the event delivery didn't fail.
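The same details can be read programmatically. The following sketch summarizes one DLQ message in the shape that the SQS ReceiveMessage API returns. The attribute names (ERROR_CODE, ERROR_MESSAGE, RETRY_ATTEMPTS, EXHAUSTED_RETRY_CONDITION, RULE_ARN, TARGET_ARN) are the ones that EventBridge attaches to DLQ messages. The sample message, including its error values, is invented for illustration.

```python
import json

def summarize_dlq_message(message):
    """Extract the failure details and original event from a DLQ message."""
    attributes = {
        name: value.get("StringValue")
        for name, value in message.get("MessageAttributes", {}).items()
    }
    retry_attempts = attributes.get("RETRY_ATTEMPTS")
    return {
        "error_code": attributes.get("ERROR_CODE"),
        "error_message": attributes.get("ERROR_MESSAGE"),
        "retry_attempts": int(retry_attempts) if retry_attempts else None,
        "exhausted_condition": attributes.get("EXHAUSTED_RETRY_CONDITION"),
        "rule_arn": attributes.get("RULE_ARN"),
        "target_arn": attributes.get("TARGET_ARN"),
        "event": json.loads(message["Body"]),  # payload meant for the target
    }

# Invented sample; real messages come from sqs.receive_message(
#     QueueUrl=..., MessageAttributeNames=["All"])["Messages"].
sample = {
    "Body": json.dumps({"source": "my.app", "detail-type": "example", "detail": {}}),
    "MessageAttributes": {
        "ERROR_CODE": {"DataType": "String", "StringValue": "NO_PERMISSIONS"},
        "ERROR_MESSAGE": {"DataType": "String", "StringValue": "example message"},
        "RETRY_ATTEMPTS": {"DataType": "Number", "StringValue": "0"},
        "RULE_ARN": {"DataType": "String",
                     "StringValue": "arn:aws:events:us-east-1:111122223333:rule/my-example-rule"},
    },
}
summary = summarize_dlq_message(sample)
print(summary["error_code"], summary["retry_attempts"], summary["event"]["source"])
```

Message attributes are returned only when the receive call requests them, which is why the comment passes MessageAttributeNames=["All"].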
AWS OFFICIAL
Updated 7 months ago