Implementing automated incident response with Incident Detection and Response and Dynatrace integration
Learn how to integrate Dynatrace with AWS Incident Detection and Response to automate incident response and create context-rich support cases that expedite issue resolution.
Introduction
Organizations that use observability tools often face delays during critical incidents because support tickets are created manually. These tools excel at detecting anomalies, but the handoff to incident response can create bottlenecks. To address this challenge, Incident Detection and Response integrates with third-party application monitoring tools, such as Dynatrace, Datadog, and Splunk. Incident Detection and Response uses Amazon EventBridge to ingest alarms from these tools and delivers a 5-minute response to the business-critical alarms that you onboard.
Solution overview
This solution creates an automated workflow between Dynatrace and Incident Detection and Response. When Dynatrace detects a problem, it initiates a workflow that performs a PutEvents operation to an EventBridge event bus. After you onboard the event bus to Incident Detection and Response, events on the bus automatically result in a detailed support case. This support case includes context and telemetry data so that you get a 5-minute response from AWS Support. This integration reduces manual data gathering and accelerates incident resolution.
The solution consists of four primary components, as seen in Figure 1:
- Dynatrace instance: Monitors AWS workloads and detects issues.
- Dynatrace workflows: Trigger automated actions when Dynatrace detects a problem.
- EventBridge event bus: Routes notifications from Dynatrace to Incident Detection and Response.
- Incident Detection and Response: Automatically creates contextualized support cases so that AWS Support Incident Management Engineers can respond in 5 minutes.
Figure 1: Dynatrace to Incident Detection and Response integration architecture
Prerequisites
Dynatrace requirements:
- Active Dynatrace Software as a Service (SaaS) instance
- Configured AWS workload monitoring
- Administrative permissions for creating connections and workflows
AWS requirements:
- AWS Unified Operations or Enterprise Support plan with an Incident Detection and Response entitlement
- EventBridge access and appropriate AWS Identity and Access Management (IAM) permissions
- Ability to create Identity Providers and IAM roles
Network requirements:
- Outbound HTTPS access from Dynatrace to AWS
- Access to *.amazonaws.com domains from Dynatrace
Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.
Solution implementation
To implement this solution, complete the following tasks:
- For AWS, create a custom event bus, OpenID Connect (OIDC) identity provider, and IAM policies. This configuration allows Dynatrace to perform a PutEvents operation to this event bus.
- For Dynatrace, use the OIDC provider and role to add a connection to AWS. Then, create a workflow that transforms events and sends them to an EventBridge event bus.
- Configure the Incident Detection and Response integration to onboard the event bus with Incident Detection and Response managed rules. These rules route business-critical alarms from Dynatrace to AWS Support.
- Test your solution implementation.
Configuring EventBridge
To create a custom event bus, run the following AWS CLI command:
aws events create-event-bus \
--name dynatrace-idr-bus \
--region us-east-1
Note: In the preceding example, the event bus name is dynatrace-idr-bus. This command creates an event bus in the us-east-1 AWS Region.
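If you rerun the setup, create-event-bus returns an error when a bus with the same name already exists. The following is a minimal sketch of one way to make the step repeatable. The function name ensure_event_bus is a hypothetical helper, not part of the AWS CLI. It assumes that the AWS CLI is installed and that credentials are configured.

```shell
#!/usr/bin/env bash
# Hypothetical helper: creates the event bus only when it doesn't exist yet,
# then prints the bus ARN. Arguments: bus name, AWS Region.
ensure_event_bus() {
  local name="$1" region="$2"
  # describe-event-bus fails when the bus doesn't exist.
  if ! aws events describe-event-bus --name "$name" --region "$region" \
       --query 'Arn' --output text 2>/dev/null; then
    aws events create-event-bus --name "$name" --region "$region" \
      --query 'EventBusArn' --output text
  fi
}
```

To use the sketch, call ensure_event_bus dynatrace-idr-bus us-east-1 and save the ARN that it prints.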
Configuring IAM and OIDC authentication
Use either the IAM console or the AWS CLI to create the required OIDC authentication provider and IAM role.
IAM console
Complete the following steps:
- Open the IAM console.
- In the navigation pane, choose Identity providers, and then choose Add provider.
- For Configure provider, choose OpenID Connect.
- For Provider URL, enter https://token.dynatrace.com.
- For Audience, enter YOUR-TENANT.apps.dynatrace.com/app-id/dynatrace.aws.connector.
AWS CLI
To get the thumbprint for the Dynatrace OIDC provider, run the following OpenSSL command:
echo | openssl s_client -servername token.dynatrace.com -showcerts -connect token.dynatrace.com:443 2>/dev/null \
| openssl x509 -fingerprint -sha1 -noout | cut -d'=' -f2 \
| tr -d ':' | tr '[:upper:]' '[:lower:]'
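Before you pass the value to --thumbprint-list, it can help to confirm that the pipeline produced a well-formed thumbprint. The following sketch assumes a hypothetical helper named is_valid_thumbprint (not part of OpenSSL or the AWS CLI). It checks for exactly 40 lowercase hexadecimal characters, which is the form that a SHA-1 fingerprint takes after the colons are removed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 when the argument is exactly
# 40 lowercase hexadecimal characters, and 1 otherwise.
is_valid_thumbprint() {
  local LC_ALL=C               # make the a-f range match ASCII only
  case "$1" in
    *[!0-9a-f]*) return 1 ;;   # reject any character outside 0-9, a-f
  esac
  [ "${#1}" -eq 40 ]
}
```

For example, if you store the pipeline output in a variable named THUMBPRINT, then is_valid_thumbprint "$THUMBPRINT" || echo "unexpected thumbprint format" >&2 flags an empty or truncated value before you create the provider.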
To create the OIDC provider, use the Dynatrace URL and thumbprint to run the following AWS CLI command:
aws iam create-open-id-connect-provider \
--url https://token.dynatrace.com \
--client-id-list YOUR-TENANT.apps.dynatrace.com/app-id/dynatrace.aws.connector \
--thumbprint-list YOUR_THUMBPRINT_HERE
Create the IAM role
After you create the identity provider, create an IAM trust policy that allows Dynatrace to assume an IAM role. Create a JSON file with the policy details to use when you create the IAM role. In the following example, the JSON file name is dynatrace-trust-policy.json:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::YOUR-ACCOUNT-ID:oidc-provider/token.dynatrace.com"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"token.dynatrace.com:aud": "YOUR-TENANT.apps.dynatrace.com/app-id/dynatrace.aws.connector"
}
}
}]
}
Note: Replace YOUR-ACCOUNT-ID and YOUR-TENANT with your values.
Use the JSON file that you created to run the following command and create an IAM role:
aws iam create-role \
--role-name DynatraceIDRRole \
--assume-role-policy-document file://dynatrace-trust-policy.json \
--description "Role for Dynatrace IDR integration"
Note: Replace DynatraceIDRRole with the name of your IAM role.
Create a policy document
Create a policy document that grants permission to perform the PutEvents action on the specified event bus. You attach this policy to the IAM role in a later step. In the following example, the file name is dynatrace-idr-policy.json:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"events:PutEvents"
],
"Resource": [
"arn:aws:events:*:*:event-bus/dynatrace-idr-bus"
]
},
{
"Effect": "Allow",
"Action": [
"account:ListRegions"
],
"Resource": "*"
}
]
}
Note: Replace dynatrace-idr-bus with the name of your event bus.
To create a policy, use the JSON file that you created to run the following command:
aws iam create-policy \
--policy-name DynatraceIDRPolicy \
--policy-document file://dynatrace-idr-policy.json \
--description "Policy for Dynatrace IDR integration to send events to EventBridge"
The output includes the policy Amazon Resource Name (ARN) and looks similar to the following example output:
arn:aws:iam::YOUR-ACCOUNT-ID:policy/DynatraceIDRPolicy
To grant your IAM role the permissions needed to perform the PutEvents operation, attach the policy ARN that’s returned to the role:
aws iam attach-role-policy \
--role-name DynatraceIDRRole \
--policy-arn arn:aws:iam::YOUR-ACCOUNT-ID:policy/DynatraceIDRPolicy
To make sure that you correctly configured the role, run the following command. The output lists the attached policy and looks similar to the example that follows the command:
aws iam list-attached-role-policies --role-name DynatraceIDRRole
{
"AttachedPolicies": [
{
"PolicyName": "DynatraceIDRPolicy",
"PolicyArn": "arn:aws:iam::YOUR-ACCOUNT-ID:policy/DynatraceIDRPolicy"
}
]
}
Configuring Dynatrace integration
Complete the following steps to configure your Dynatrace integration.
Create the AWS connection in Dynatrace
Complete the following steps:
- Open Dynatrace.
- In the navigation pane, under Settings, choose Connections, and then select AWS.
- Choose Add Connection, and then enter a name for the connection, such as AWS-IDR-EventBridge.
- For Type, select Web identity.
- Choose Create. Save the connection ID that’s created.
Note: When you create the connection, the ARN field activates for you to edit.
- Add the ARN from the IAM role that you created to the Dynatrace connection, and then choose Save.
- In the IAM console, add the Dynatrace connection ID to the IAM role trust policy, as seen in the following example:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::YOUR-ACCOUNT-ID:oidc-provider/token.dynatrace.com"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"token.dynatrace.com:aud": "YOUR-TENANT.apps.dynatrace.com/app-id/dynatrace.aws.connector",
"token.dynatrace.com:sub": "dt:connection-id/YOUR-CONNECTION-ID"
}
}
}]
}
Note: In the preceding example, replace YOUR-ACCOUNT-ID with the AWS Account ID and YOUR-TENANT with the Dynatrace instance ID. Replace YOUR-CONNECTION-ID with the ID of the connection that you created.
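If you prefer the AWS CLI over the IAM console for this step, then the following sketch shows one way to apply the same trust policy. The function names render_trust_policy and update_trust_policy are hypothetical helpers. The sketch assumes the role name DynatraceIDRRole and the file name dynatrace-trust-policy.json from the earlier examples.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the trust policy with the placeholders filled in.
# Arguments: AWS account ID, Dynatrace tenant ID, Dynatrace connection ID.
render_trust_policy() {
  local account_id="$1" tenant="$2" connection_id="$3"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::${account_id}:oidc-provider/token.dynatrace.com"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "token.dynatrace.com:aud": "${tenant}.apps.dynatrace.com/app-id/dynatrace.aws.connector",
        "token.dynatrace.com:sub": "dt:connection-id/${connection_id}"
      }
    }
  }]
}
EOF
}

# Hypothetical wrapper: writes the policy to a file and replaces the
# trust policy on the role (assumes the role name DynatraceIDRRole).
update_trust_policy() {
  render_trust_policy "$1" "$2" "$3" > dynatrace-trust-policy.json
  aws iam update-assume-role-policy \
    --role-name DynatraceIDRRole \
    --policy-document file://dynatrace-trust-policy.json
}
```

To use the sketch, call update_trust_policy with your account ID, tenant ID, and connection ID. Because render_trust_policy only prints text, you can review its output before you change the role.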
Configure external requests in Dynatrace
To allow outbound connections from Dynatrace to AWS, add the AWS domains to the Dynatrace allowlist:
- Open Dynatrace.
- In the navigation pane, under Settings, choose General. Then, choose External requests.
- For New host pattern, enter *.amazonaws.com. This host pattern allows Dynatrace to make a connection to the event bus.
Create a Dynatrace workflow
Workflows are features in Dynatrace that you can configure to react to significant events or problems. These workflows can perform a PutEvents operation to an EventBridge event bus.
To create a workflow, complete the following steps:
- Open Dynatrace.
- In the navigation pane, under Apps, choose Software Delivery. Then, choose Workflows.
- Configure your event filter. In this example, the Davis problem trigger is configured.
- For the event filter, add the AWS EventBridge – PutEvents action.
- For Configure connection for action, select the connection that you created.
- For Region, choose the Region where you created the event bus. In the following example, the Region is us-east-1.
- For Parameters, include a JSON template to format the message for Incident Detection and Response:
[{
"DetailType": "ams.monitoring/generic-apm",
"Source": "GenericAPMEvent",
"Detail": "{{ {
"severity": "CRITICAL",
"problemId": event()["display_id"],
"incident-detection-response-identifier": event()["event.name"],
"description": event()["event.description"],
"status": event()["event.status"],
"category": event()["event.category"],
"startTime": event()["event.start"],
"affectedEntities": event()["affected_entity_ids"],
"dynatraceUrl": "https://YOUR-TENANT.apps.dynatrace.com/ui/problems/" +
event()["display_id"]
} | tojson | replace('\\', '\\\\') | replace('"', '\\"') }}",
"EventBusName": "dynatrace-idr-bus"
}]
Note: For DetailType, enter ams.monitoring/generic-apm. For Source, enter GenericAPMEvent. For incident-detection-response-identifier, enter the name of the Incident Detection and Response alarm. This name triggers the correct runbook. In this example, the alarm name is the Davis problem name (event.name). The template uses tojson to serialize the object as JSON. The replace filters then escape backslashes and double quotes so that the Detail field is a valid JSON string that the event bus accepts.
- Save and deploy the workflow.
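The PutEvents operation expects Detail to be a string that contains JSON, which is why the template escapes the serialized object. To preview what that escaping produces, the following sketch reproduces the two replace filters with sed. The function name escape_for_detail and the sample payload are hypothetical and only illustrate the transformation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: escapes a compact JSON document so that it can be
# embedded as the string value of "Detail". Backslashes are doubled first,
# then double quotes are escaped, in the same order as the template's filters.
escape_for_detail() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Sample payload (illustrative values only).
detail='{"severity":"CRITICAL","problemId":"P-1"}'
printf '"Detail": "%s"\n' "$(escape_for_detail "$detail")"
# prints: "Detail": "{\"severity\":\"CRITICAL\",\"problemId\":\"P-1\"}"
```

The sketch assumes single-line JSON, which matches the compact output of tojson. The order of the two substitutions matters: if quotes were escaped first, then the backslashes added for the quotes would be doubled by the second step.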
Configuring Incident Detection and Response integration
For Incident Detection and Response to ingest alarms from your account, use the AWS CLI to create the service-linked role AWSServiceRoleForHealth_EventProcessor:
aws iam create-service-linked-role \
--aws-service-name event-processor.health.amazonaws.com
This role is required for the Incident Detection and Response team to deploy the custom rules and onboard your workload. After you create this role, contact your Technical Account Manager (TAM) to onboard this workload to Incident Detection and Response.
Testing the integration
After you complete the integration steps, test the integration.
Note: Before you deploy this solution to production, it’s a best practice to thoroughly test the solution in a non-production environment.
- To test event delivery, run the following command to confirm that the event bus accepts a PutEvents call:
aws events put-events \
--entries '[{
"DetailType": "ams.monitoring/generic-apm",
"Source": "GenericAPMEvent",
"Detail": "{\"problemId\":\"TEST-123\",\"severity\":\"CRITICAL\",\"problemTitle\":\"Test: For Dynatrace\"}",
"EventBusName": "dynatrace-idr-bus"
}]' \
--region us-east-1
The preceding command simulates a Dynatrace incident alert that's sent to your custom event bus.
- To capture and process the test event and prove that the AWS part of the integration works, run the following command to create a test rule:
aws events put-rule \
--name DynatraceToIDR \
--event-pattern '{"detail-type": ["ams.monitoring/generic-apm"], "source": ["GenericAPMEvent"]}' \
--event-bus-name dynatrace-idr-bus \
--region us-east-1
Note: After you onboard to Incident Detection and Response, delete this test rule.
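- To verify that the rule matches, send the test event again, because a rule evaluates only the events that arrive after you create it. Then, check the Amazon CloudWatch metrics for EventBridge. You can also review the Invocations metric for DynatraceToIDR. For more information, see Monitoring Amazon EventBridge.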
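For the event delivery test, note that put-events can exit successfully even when EventBridge rejects an entry, because rejected entries are reported in the FailedEntryCount field of the response. The following sketch wraps the test command in a hypothetical function named send_test_event that fails when that count isn't 0. It assumes the bus name and Region from the earlier examples and uses a shortened test payload.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sends a test event and fails when EventBridge
# reports rejected entries. Requires the AWS CLI and valid credentials.
send_test_event() {
  local failed
  failed="$(aws events put-events \
    --entries '[{
      "DetailType": "ams.monitoring/generic-apm",
      "Source": "GenericAPMEvent",
      "Detail": "{\"problemId\":\"TEST-123\",\"severity\":\"CRITICAL\"}",
      "EventBusName": "dynatrace-idr-bus"
    }]' \
    --region us-east-1 \
    --query 'FailedEntryCount' \
    --output text)" || return 1
  if [ "$failed" != "0" ]; then
    echo "put-events reported ${failed} failed entries" >&2
    return 1
  fi
  echo "Event accepted by the event bus"
}
```

To use the sketch, call send_test_event and check its exit status. A nonzero status means either that the CLI call failed or that EventBridge rejected the entry.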
After you onboard your workload, the Incident Detection and Response team works with you to run a game day. During the game day event, you complete the following tasks to test the integration:
- Initiate a test problem in Dynatrace.
- Verify that the workflow executes.
- Confirm that you can find the EventBridge event receipt.
- Check for the Incident Detection and Response support case creation.
- Validate context and telemetry data.
Troubleshooting
During your tests, if the integration doesn’t work as you expect, then complete the following steps to review the logs for events sent to EventBridge:
- To create a log group, run the following command:
aws logs create-log-group \
--log-group-name /aws/events/dynatrace-idr-bus-debug \
--region us-east-1
- Create a custom rule to capture all the events:
# Create a rule that matches ALL events on the bus
aws events put-rule \
--name DynatraceEventCapture \
--event-pattern '{"source": [{"prefix": ""}]}' \
--state ENABLED \
--event-bus-name dynatrace-idr-bus \
--region us-east-1
- To record each event payload, add the CloudWatch Logs log group as a target of the rule. In the following command, replace YOUR-REGION and YOUR-ACCOUNT-ID with your values:
aws events put-targets \
--rule DynatraceEventCapture \
--event-bus-name dynatrace-idr-bus \
--targets "Id"="1","Arn"="arn:aws:logs:YOUR-REGION:YOUR-ACCOUNT-ID:log-group:/aws/events/dynatrace-idr-bus-debug" \
--region us-east-1
- Initiate a workflow run in Dynatrace. Then, run the following AWS CLI command to review the logs and see the payload that was sent:
aws logs filter-log-events \
--log-group-name /aws/events/dynatrace-idr-bus-debug \
--start-time $(($(date +%s) - 600))000 \
--region us-east-1
Note: The preceding command shows the messages for the previous 10 minutes.
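- Delete the rule that you created for the test. Remove the rule's target first, because EventBridge doesn't delete a rule that still has targets:
aws events remove-targets \
--rule DynatraceEventCapture --ids 1 \
--event-bus-name dynatrace-idr-bus --region us-east-1
aws events delete-rule \
--name DynatraceEventCapture \
--event-bus-name dynatrace-idr-bus \
--region us-east-1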
- (Optional) Delete the log group that contains the logs that you captured during the test:
aws logs delete-log-group \
--log-group-name /aws/events/dynatrace-idr-bus-debug \
--region us-east-1
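If no events appear in the log group after you add it as a target with the AWS CLI, one possible cause is a missing CloudWatch Logs resource policy. The console creates this policy automatically when you add a log group target, but the AWS CLI doesn't. The following sketch shows one way to add the policy. The function name render_log_policy and the policy name DynatraceIdrDebugLogs are hypothetical. The policy content follows the pattern that allows EventBridge to write to log groups under /aws/events/.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a CloudWatch Logs resource policy that allows
# EventBridge to write to log groups under /aws/events/.
# Arguments: AWS Region, AWS account ID.
render_log_policy() {
  local region="$1" account_id="$2"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "TrustEventsToStoreLogEvent",
    "Effect": "Allow",
    "Principal": {
      "Service": ["events.amazonaws.com", "delivery.logs.amazonaws.com"]
    },
    "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
    "Resource": "arn:aws:logs:${region}:${account_id}:log-group:/aws/events/*:*"
  }]
}
EOF
}

# Hypothetical wrapper: applies the policy (the policy name is an assumption).
apply_log_policy() {
  aws logs put-resource-policy \
    --policy-name DynatraceIdrDebugLogs \
    --policy-document "$(render_log_policy "$1" "$2")" \
    --region "$1"
}
```

To use the sketch, call apply_log_policy with your Region and account ID, for example apply_log_policy us-east-1 YOUR-ACCOUNT-ID, and then trigger the workflow again.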
Cleanup
To avoid ongoing charges, delete the following resources when your testing is complete:
- EventBridge custom event bus
- IAM roles and policies
- Dynatrace workflow configurations
- OIDC provider settings
Conclusion
This integration streamlines incident response by automating the critical handoff between Dynatrace monitoring and AWS Support. To reduce the mean time to resolution (MTTR), the integration does the following:
- Reduces manual information gathering
- Provides AWS Support engineers with immediate access to comprehensive telemetry data
The solution maintains the existing process while enhancing the response with automated AWS expertise.
To learn more about how AWS Support plans and offerings can help you get the most out of your AWS environment, see AWS Support. For implementation support, contact Dynatrace Support for workflow and monitoring configuration assistance. Or, contact AWS Support for EventBridge and Incident Detection and Response configuration guidance.
About the authors
Llewellyn Crossley
Llewellyn Crossley is a Senior Incident Management Engineer at AWS who works within AWS Incident Detection and Response. They contribute to developing and improving incident management processes that enhance customer experience during critical operational events. Their focus areas include incident response optimization, escalation management, and incident management frameworks.
Yomesh Shah
Yomesh Shah is a Senior Solutions Architect at AWS. He brings 25 years of experience helping enterprises maximize the value of their IT investments through optimization, automation, and process improvement. He currently helps AWS customers leverage and apply scalable AWS Support solutions. Yomesh also holds a patent for the design of a Managed Services control plane in the cloud (US11856055B2).
