Integrating PagerDuty with Incident Detection and Response to improve critical workload monitoring

13 minute read
Content level: Expert

This article shows you how to integrate PagerDuty with AWS Incident Detection and Response to improve your incident management capabilities.

Introduction

In today's cloud-first world, organizations need robust incident detection and response capabilities to maintain high availability and reliability. AWS Incident Detection and Response provides 24/7 proactive monitoring with a response time of 5 minutes, along with complete incident management for critical workloads that run on AWS. The service combines incident managers with context-aware AWS service experts who have access to advanced AWS tooling to help customers respond to and resolve critical events quickly and effectively.

PagerDuty is an online incident management platform that aggregates digital signals from various custom application monitoring tools. By combining these signals, PagerDuty can detect, triage, and automatically route issues to the right on-call personnel.

You can integrate Incident Detection and Response with your PagerDuty platform to create unified monitoring and use the AWS rapid response and incident management ecosystem without changing your current operating model. This integration allows PagerDuty's critical observability alerts to flow into the Incident Detection and Response framework. As a result, you can reduce both the mean time to detect incidents and the time to resolve them.

Integration architecture

The integration between Incident Detection and Response and PagerDuty uses Amazon EventBridge and AWS Lambda to process and route monitoring alerts. You can configure PagerDuty to automatically send alert notifications to EventBridge when a monitor alert condition is met, as seen in the following image:

Image

The architecture demonstrates the event flow from PagerDuty monitoring through Incident Detection and Response:

  1. PagerDuty monitors your applications and infrastructure and generates alerts based on your defined monitor thresholds and conditions.

  2. A partner event source is a connection in EventBridge that allows PagerDuty to send events to your AWS account. An EventBridge partner event bus serves as the event bus that receives alerts from PagerDuty through the partner event source.

  3. EventBridge rules route relevant alerts to a Lambda function for transformation.

  4. Lambda transforms the PagerDuty alert payload into the format that Incident Detection and Response requires.

  5. A custom EventBridge event bus receives the transformed events for Incident Detection and Response consumption.

Incident Detection and Response receives these alerts and initiates the appropriate response procedures according to your engagement model.

Integration configuration

Prerequisites and assumptions

Make sure that you meet the following prerequisites:

  • The account that runs your critical applications is on an Enterprise Support or Unified Operations plan that provides access to Incident Detection and Response.

  • The AWS Identity and Access Management (IAM) role or user that you use for this integration has access to create and update IAM permissions for EventBridge and Lambda.

  • Your PagerDuty user has the Manager, Admin, Global Admin, or Account Owner role to configure this integration.

Initial integration setup can take 15-30 minutes, and testing and validation can take 15-30 minutes. However, your setup time can vary depending on your familiarity with AWS services and existing PagerDuty configuration complexity.

Step 1 - Integrate PagerDuty with AWS

Complete the following steps to integrate PagerDuty with AWS:

  1. Open your PagerDuty console as a user with one of the required roles.

  2. Choose the Integrations tab.
    Image

  3. Look for the Amazon EventBridge extension, and then choose Add.

  4. Enter a name for the integration, your AWS account ID, AWS Region, and critical services that are running in your AWS account.

  5. For Event Source Name, enter a unique identifier.

  6. Choose Create.

Image

You see that EventBridge now appears as a new extension under the Service Extensions tab.

Note: After you complete the setup, PagerDuty automatically creates a partner event source in EventBridge in the selected AWS account and Region. The event source name uses the following format: aws.partner/pagerduty.com/<your-event-source-name>
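The naming format in the preceding note can be sketched as a small helper. This is a minimal illustration, not part of the integration itself. The function names and the example value prod-alerts are hypothetical.

```python
# Prefix that PagerDuty partner event sources use, per the note above.
PARTNER_PREFIX = "aws.partner/pagerduty.com/"


def partner_event_source_name(event_source_name: str) -> str:
    """Build the full partner event source name from the name entered in PagerDuty."""
    name = event_source_name.strip()
    if not name:
        raise ValueError("event source name must not be empty")
    return f"{PARTNER_PREFIX}{name}"


def is_pagerduty_source(source: str) -> bool:
    """Return True when an event source name belongs to the PagerDuty partner."""
    return source.startswith(PARTNER_PREFIX)


# "prod-alerts" is a hypothetical event source name used only for illustration.
full_name = partner_event_source_name("prod-alerts")
```

You can use the resulting name to locate the partner event source in the EventBridge console in the next step.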

Step 2 - Validate the PagerDuty EventBridge partner event source

Complete these steps for PagerDuty to deliver the alert notification to EventBridge:

  1. Open the EventBridge console.

  2. In the Regions selector, choose the Region where you created the EventBridge integration in PagerDuty.

  3. In the navigation pane, under Integration, choose Partner event sources.
    The PagerDuty integration appears under Partner event sources. Its status is Pending until you associate it with an event bus. See the following screenshot:

Image

  4. Select the partner event source, and then choose Associate with event bus. The status of the event source changes from Pending to Active, and the name of the event bus updates to match the partner event source name. You can now create rules that match events from the partner event source.
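If you prefer to script this check, the following is a minimal sketch of the same steps with the AWS SDK for Python (Boto3). It assumes that you pass in a client created with boto3.client("events") in the Region that you selected in PagerDuty. The function names are hypothetical, and the sketch ignores pagination for brevity.

```python
def find_pagerduty_sources(events_client, prefix="aws.partner/pagerduty.com"):
    """Return (name, state) pairs for partner event sources under the prefix."""
    response = events_client.list_event_sources(NamePrefix=prefix)
    return [(src["Name"], src["State"]) for src in response.get("EventSources", [])]


def associate_pending_sources(events_client, sources):
    """Create a partner event bus for each source that is still PENDING."""
    associated = []
    for name, state in sources:
        if state == "PENDING":
            # A partner event bus uses the same name as its partner event source.
            events_client.create_event_bus(Name=name, EventSourceName=name)
            associated.append(name)
    return associated
```

Sources that are already ACTIVE are left untouched, so the sketch is safe to run again after the association completes.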

Step 3 - Create a custom EventBridge event bus

Incident Detection and Response uses a custom event bus to ingest transformed events that PagerDuty sends. To create a custom event bus that can ingest the transformed events, complete the following steps:

  1. Open the EventBridge console.

  2. In the navigation pane, under Buses, choose Event buses.

  3. Under Custom event bus, choose Create event bus.

  4. For Name, enter a name for your event bus. Example: PagerDuty-AWSIncidentDetectionResponse-EventBus

  5. For Description, enter a description for your event bus.

  6. Update other settings and options according to your company policies, or keep the defaults.

  7. Choose Create to create the event bus.
    You can find this new event bus on the Event buses page.

Image
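As an alternative to the console, the following minimal sketch creates the custom event bus with Boto3 only when it doesn't already exist. It assumes that you pass in a client created with boto3.client("events"). The function name ensure_event_bus is hypothetical, and pagination is ignored for brevity.

```python
def ensure_event_bus(events_client, name):
    """Return the ARN of the named custom event bus, creating the bus if it is missing."""
    buses = events_client.list_event_buses(NamePrefix=name).get("EventBuses", [])
    for bus in buses:
        # NamePrefix can return longer names, so compare the full name.
        if bus["Name"] == name:
            return bus["Arn"]
    return events_client.create_event_bus(Name=name)["EventBusArn"]
```

The returned ARN is the value that you later use as the Resource in the Lambda function's IAM policy statement.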

Step 4 - Configure a Lambda function for PagerDuty payload transformation

Create a Lambda function

Create a Lambda function to transform the standard alert events from PagerDuty to a format that Incident Detection and Response can ingest. The Lambda function transforms the PagerDuty alert payload and sends the events to the custom event bus (from the previous step) that Incident Detection and Response uses. To create the Lambda function, complete the following steps:

  1. Open the Lambda console.

  2. In the navigation pane, choose Functions.

  3. Choose Create function.

  4. On the Create function page, choose Author from scratch.

  5. For Function name, enter a name for your function. Example: PagerDuty-AWSIncidentDetectionResponse-EventTransformer

  6. For Runtime, select the appropriate Lambda runtime.
    Note: The Lambda function code in this article was tested with Python 3.14.

  7. Choose Create function.

Update your Lambda function

The next step is to set up this function to transform events between the partner event bus and the custom event bus that you created. Incident Detection and Response requires a few specific key-value pairs in the alert payload, such as detail-type, source, and detail, as in the following JSON example. The Lambda function transforms the PagerDuty alert to include these required JSON key-value pairs.

{
  "detail-type": "ams.monitoring/generic-apm",
  "source": "GenericAPMEvent",
  "detail": {
    "incident-detection-response-identifier": "Your alarm name from your APM"
  }
}
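To make the required shape concrete, the following minimal sketch checks a transformed event for the three key-value pairs in the preceding JSON example. It covers only the fields that this article names. The function name find_problems is hypothetical, and the real service might apply additional validation.

```python
REQUIRED_DETAIL_TYPE = "ams.monitoring/generic-apm"
REQUIRED_SOURCE = "GenericAPMEvent"
IDENTIFIER_KEY = "incident-detection-response-identifier"


def find_problems(event):
    """Return a list of human-readable problems; an empty list means the shape looks right."""
    problems = []
    if event.get("detail-type") != REQUIRED_DETAIL_TYPE:
        problems.append(f"detail-type must be {REQUIRED_DETAIL_TYPE!r}")
    if event.get("source") != REQUIRED_SOURCE:
        problems.append(f"source must be {REQUIRED_SOURCE!r}")
    detail = event.get("detail")
    if not isinstance(detail, dict) or not detail.get(IDENTIFIER_KEY):
        problems.append(f"detail must contain a non-empty {IDENTIFIER_KEY!r}")
    return problems
```

A check like this can run in a unit test before you deploy changes to the transformation logic.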

Complete the following steps to deploy the function that you created:

  1. In the Lambda console, choose Functions, and then select the function that you just created.

  2. In the Code tab, copy and paste the following code to replace the default Lambda function. 
    Note: Replace PagerDuty-AWSIncidentDetectionResponse-EventBus with the name of your event bus. Make sure that the code sets the incident-detection-response-identifier key to the title field of the PagerDuty alert payload. This value is the unique identifier for your critical alarm.

import logging
import json

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Change the EventBusName to the custom event bus name you created previously or use your default event bus which is called 'default'.
# Example 'PagerDuty-AWSIncidentDetectionResponse-EventBus'
EventBusName = "PagerDuty-AWSIncidentDetectionResponse-EventBus"


def lambda_handler(event, context):
    # Set the event["detail"]["incident-detection-response-identifier"] value to the name of your alert that is coming from your APM.
    # Each APM is different and each unique alert will have a different name.
    # This example is for finding the alert name for PagerDuty.
    event["detail"]["incident-detection-response-identifier"] = event["detail"]["incident"]["title"]

    logger.info(f"We got: {json.dumps(event, indent=2)}")

    client = boto3.client('events')
    response = client.put_events(
        Entries=[
            {
                'Detail': json.dumps(event["detail"], indent=2),
                'DetailType': 'ams.monitoring/generic-apm',  # Do not modify. This DetailType value is required.
                'Source': 'GenericAPMEvent',  # Do not modify. This Source value is required.
                'EventBusName': EventBusName  # Do not modify. Change the EventBusName value at the top of this code instead.
            }
        ]
    )
    print(response['Entries'])

  3. Choose Deploy.
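The function above prints the PutEvents result but doesn't fail when EventBridge rejects an entry. PutEvents can return successfully while reporting rejected entries through FailedEntryCount, so you might want the function to surface those failures. The following is a minimal sketch of that check. The helper name failed_entries is hypothetical and isn't part of the tested code in this article.

```python
def failed_entries(response):
    """Return (index, error_code, error_message) for each entry that PutEvents rejected."""
    if response.get("FailedEntryCount", 0) == 0:
        return []
    return [
        (index, entry.get("ErrorCode"), entry.get("ErrorMessage"))
        for index, entry in enumerate(response.get("Entries", []))
        if "ErrorCode" in entry  # accepted entries carry an EventId instead
    ]
```

In the handler, you could call failed_entries(response) after put_events and raise an exception when the list isn't empty, so that the failure appears in the function's error metrics and logs.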

Grant the required permissions to your function

Complete the following steps to grant the required permissions to the Lambda function to send the transformed events to the custom event bus:

  1. In the Lambda console, choose Functions, and then select the function that you just created.

  2. Choose Configuration, and then choose Permissions.

  3. Under Execution role, choose View role details in IAM.
    The execution role opens in the IAM console.

  4. In the IAM console, choose Permissions.

  5. Under Permissions policies, choose the existing policy name.
    The policy opens in a new IAM console page.

  6. Choose Edit.

  7. Under Policy editor, choose Add new statement.
    The policy editor adds a new blank statement.

  8. Copy and paste the following policy to replace the blank statement. Replace the Resource value with the Amazon Resource Name (ARN) of the custom event bus that you created.

{
  "Sid": "AWSIncidentDetectionResponseEventBus",
  "Effect": "Allow",
  "Action": "events:PutEvents",
  "Resource": "arn:aws:events:{region}:{accountId}:event-bus/PagerDuty-AWSIncidentDetectionResponse-EventBus"
}

  9. Choose Next, and then choose Save changes. Confirm that the required permission is added to the role.
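To avoid typos when you fill in the {region} and {accountId} placeholders, you can generate the statement programmatically. The following is a minimal sketch. The function name is hypothetical, and the Region and account ID in the example are placeholder values.

```python
import json


def put_events_statement(region, account_id, bus_name):
    """Build the IAM policy statement that allows PutEvents on one custom event bus."""
    return {
        "Sid": "AWSIncidentDetectionResponseEventBus",
        "Effect": "Allow",
        "Action": "events:PutEvents",
        "Resource": f"arn:aws:events:{region}:{account_id}:event-bus/{bus_name}",
    }


# Placeholder Region and account ID, for illustration only.
statement = put_events_statement(
    "us-east-1", "123456789012", "PagerDuty-AWSIncidentDetectionResponse-EventBus"
)
statement_json = json.dumps(statement, indent=2)
```

You can paste the value of statement_json into the policy editor in place of the blank statement.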

Step 5 - Create an EventBridge rule

Create an EventBridge rule on your PagerDuty partner event bus to forward the events from the PagerDuty alerts to the Lambda function for transformation. The function transforms the event and forwards it to the Incident Detection and Response service bus. Then, Incident Detection and Response automatically selects the alerts from the bus. Complete these steps:

  1. Open the EventBridge console.

  2. In the navigation pane, under Buses, choose Rules.

  3. Under Select event bus, for Event bus, select the PagerDuty partner event bus that you created. Example: aws.partner/pagerduty.com/<your-event-source-name>

  4. Choose Create Rule.

  5. For Name, enter a name for your rule.

  6. For Description, enter a description for your rule.

  7. Select Enable the rule on the selected event bus.

  8. Choose Next.

  9. For Event source, select AWS events or EventBridge partner events.

  10. For Event pattern, select Use pattern form.

  11. For Event source, choose EventBridge partners.

  12. For Partner, select PagerDuty.

  13. For Event type, select All Events.

  14. Choose Next.

  15. In the Select target(s) page, for Target types, select AWS service.

  16. For Select a target, select Lambda function.

  17. For Target location, select Target in this account.

  18. For Function, select the function that you created. Example: PagerDuty-AWSIncidentDetectionResponse-EventTransformer

  19. For Permissions, select Use execution role.

  20. For Execution role, select Create a new role for this specific resource.

  21. Keep the value for Role name as is.

  22. Choose Next.

  23. (Optional) Add a new tag or tags, as needed.

  24. Review the rule, and then choose Create rule.

The rule that you created looks like the following:

Image
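For reference, the following sketch shows an event pattern of the kind that the pattern form is expected to produce for all PagerDuty partner events: a prefix match on the source field. The exact pattern that the console generates is an assumption here, so compare it with the pattern shown in your rule. The matcher is a simplified illustration that handles only exact and prefix matches on source, not the full EventBridge pattern language.

```python
import json


def pagerduty_event_pattern():
    """Pattern that matches any event whose source starts with the PagerDuty partner prefix."""
    return {"source": [{"prefix": "aws.partner/pagerduty.com"}]}


def source_matches(pattern, event):
    """Simplified check: does the event's source satisfy the pattern's source conditions?"""
    source = event.get("source", "")
    for condition in pattern.get("source", []):
        if isinstance(condition, dict) and "prefix" in condition:
            if source.startswith(condition["prefix"]):
                return True
        elif condition == source:
            return True
    return False


pattern_json = json.dumps(pagerduty_event_pattern())
```

If you want to forward only certain alerts, you can narrow the pattern later, but start with the broad match to confirm that events flow end to end.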

Test the integration

Complete the following steps to verify that the integration works:

  1. To trigger a test alert in the PagerDuty console, create a new incident alert or modify an existing one.

  2. Check the Lambda function’s Amazon CloudWatch Logs to verify that the function received and processed the PagerDuty alert.

  3. Check the CloudWatch metrics for the custom event bus. Verify that EventBridge delivers the events to the custom event bus that you created.

If your integration is successful, then PagerDuty alerts appear in the Lambda function logs with the full event payload. Also, EventBridge publishes the events to the custom event bus with the required field incident-detection-response-identifier. To view the complete transformed event payload, configure logging at the INFO level on this custom event bus.
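Before you trigger a real alert, you can dry-run the transformation locally. The following minimal sketch mirrors the identifier assignment from the Lambda function without calling AWS. The sample event is hypothetical and contains only the fields that the function reads. A real PagerDuty event carries many more fields.

```python
IDENTIFIER_KEY = "incident-detection-response-identifier"


def transform_detail(event):
    """Return a copy of event["detail"] with the identifier set from the incident title."""
    try:
        title = event["detail"]["incident"]["title"]
    except (KeyError, TypeError) as exc:
        raise ValueError("event has no detail.incident.title field") from exc
    detail = dict(event["detail"])  # shallow copy so the input event stays unchanged
    detail[IDENTIFIER_KEY] = title
    return detail


# Hypothetical sample event, reduced to the fields that the transformation needs.
sample_event = {
    "source": "aws.partner/pagerduty.com/example-source",
    "detail": {"incident": {"title": "Example alarm title"}},
}
transformed = transform_detail(sample_event)
```

If the dry run raises ValueError for a payload captured from your logs, then the title lookup in the Lambda function would fail on that payload as well.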

Troubleshoot common issues

PagerDuty partner event source doesn’t appear in EventBridge

Symptoms:

  • The partner event source isn’t visible in the EventBridge console.

  • The status of the PagerDuty partner event source shows as Pending instead of Active.

Possible causes:

  • Your PagerDuty role doesn’t have the required permissions.

  • You didn’t correctly configure the EventBridge integration in the PagerDuty console.

  • You selected the incorrect AWS account or Region during setup.

  • You didn’t create an EventBridge rule to route PagerDuty alerts.

Resolution steps:

  • Make sure that you selected the correct Region in the AWS Management Console when you configured the integration.

  • Wait a few minutes for the partner event source to appear after configuration.

  • If the issue persists, then recreate the EventBridge integration in PagerDuty.

  • Verify that the PagerDuty AWS integration is active and properly configured in both consoles.

  • If you didn’t create the EventBridge rule already, then create one to route the PagerDuty alerts.

Your Lambda function doesn’t receive the events

Symptoms:

  • No logs appear in CloudWatch for the Lambda function.

  • EventBridge rule shows no invocations.

Possible causes:

  • You didn’t configure the EventBridge rule properly.

  • The event pattern matching is incorrect.

  • The Lambda function doesn’t have the required permissions.

Resolution steps:

  • Verify that the EventBridge rule targets the correct Lambda function.

  • Make sure that you turned on the rule and the rule is active.

  • Make sure that you have the correct trigger for the Lambda function.

  • Make sure that the IAM roles that are associated with your Lambda function have the required permissions.

  • Test the EventBridge rule in the AWS Command Line Interface (AWS CLI) or console.

  • Review CloudWatch metrics for the EventBridge rule to confirm that EventBridge processes the events.
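Some of the preceding checks can be scripted. The following minimal sketch inspects the rule's state and targets with Boto3. It assumes that you pass in a client created with boto3.client("events"). The function name diagnose_rule and its parameters are hypothetical.

```python
def diagnose_rule(events_client, rule_name, bus_name, expected_function_arn):
    """Return a list of problems found with the rule's state or targets."""
    problems = []
    rule = events_client.describe_rule(Name=rule_name, EventBusName=bus_name)
    if rule.get("State") != "ENABLED":
        problems.append(f"rule state is {rule.get('State')}, expected ENABLED")
    targets = events_client.list_targets_by_rule(Rule=rule_name, EventBusName=bus_name)
    target_arns = [target["Arn"] for target in targets.get("Targets", [])]
    if expected_function_arn not in target_arns:
        problems.append("rule does not target the expected Lambda function")
    return problems
```

The comparison is an exact string match, so if your rule targets a specific function version or alias, pass the qualified ARN.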

PagerDuty monitors don’t send notifications to EventBridge

Symptoms:

  • PagerDuty triggers the monitor alerts, but no events appear in EventBridge.

Possible causes:

  • You didn’t correctly configure the EventBridge integration in the PagerDuty console.

  • You selected the incorrect AWS account or Region during setup.

  • You didn’t create an EventBridge rule to route PagerDuty alerts.

Resolution steps:

  • Verify that the source name under Partner event sources in the EventBridge console contains your PagerDuty event source name.

  • Review CloudWatch metrics for the EventBridge rule to confirm that EventBridge processes the event.

  • If you didn’t create an EventBridge rule already, then create one to route the PagerDuty alerts.

For more information, see AWS Support or contact your Technical Account Manager (TAM).

Conclusion

You can integrate Incident Detection and Response with PagerDuty to create a proactive monitoring and response system for your critical workloads on AWS. This integration uses your existing PagerDuty alerting investment and adds AWS expertise in the incident management lifecycle. Your configuration that routes PagerDuty alerts through EventBridge and Lambda automatically escalates critical alerts to the Incident Detection and Response team for rapid resolution.

To get started with Incident Detection and Response, submit the Contact us form or contact your AWS account manager.

About the author

Nitin Verma

Nitin is a Principal Solutions Architect specializing in cloud operations and AWS Support solutions. With over a decade of experience in cloud migration, modernization, and DevSecOps, Nitin helps customers achieve operational excellence in the AWS Cloud.