Following best practices to create and manage your AWS Support cases
This article suggests some best practices that you can follow when you create and manage AWS Support cases.
Introduction
AWS Support offers a comprehensive suite of services to enhance your cloud experience and accelerate your business outcomes. In addition to assisting with operational disruptions, AWS Support provides proactive planning, communications, advisory services, automation support, and cloud expertise. The support engineering team is extensively trained across all AWS technical domains. They provide a wide range of assistance, including general guidance, how-to advice, best practice recommendations, troubleshooting, and operational support. When you face challenges with your AWS resources, particularly issues that affect your production workloads, you can create support cases to start the problem-resolution process. To help significantly reduce the time that support engineering requires to resolve your issues, create support cases with complete and detailed information about the issue.
This article outlines the best practices for creating and managing support cases. These guidelines aim to streamline your interactions with support engineering, reduce resolution times, and ultimately help you to use your AWS services effectively.
Best practices
Review health events
Before you create a support case, follow these steps:
- To review events that might be impacting resources within your account, see the Your account health section in the AWS Health Dashboard.
- Identify and prioritize the events. To prioritize, focus on the severity of the event and the affected services and resources within your account.
- Implement the remediation steps that AWS recommends. These steps might include actions such as initiating a failover to a different Availability Zone, modifying resource configurations, or contacting AWS Support for further assistance. To manage and mitigate the impact of events on your AWS resources effectively, check your dashboard regularly. Take prompt action based on the information that's provided.
- To review events that are related to AWS service issues, see the Service health section in the AWS Health Dashboard. This dashboard provides a comprehensive view of the current operational status and performance of AWS services across all Regions. It displays up-to-date information on any service disruptions, scheduled maintenance, or operational issues that might affect your applications and workloads.
If you aren't sure about a potential impact, then create a support case. Proactive engagement with AWS Support is always your best course of action.
Create a support case
For information on how to create a support case, see Creating a support case. This section details all the best practices that you can follow when you create and manage a support case.
Choose your primary service
If your issue is related to multiple AWS services, for Service, select the primary service that's involved.
Provide clear and concise information
Be sure to include a clear and concise description of the issue in your support case, including the actual behavior and the expected behavior. When you add context in your support case, it helps support engineering to customize the resolution according to your needs. For example, mention the affected resources, any relevant integrations with other systems, the AWS Regions where the workload operates, and the expected outcomes. Also include other relevant details, such as error messages, symptoms, and the impact on your applications or services.
The following is an example of an unclear description:
"Our AWS resources are not working properly. Can you help fix this?"
This description might require multiple follow-up questions from the AWS Support team to understand the following details:
- What are the specific AWS services that are affected?
- What are the symptoms or errors?
- When did the issue start?
- What's the impact on your operations?
- What troubleshooting steps, if any, did you already take?
The following is an example of a clear description:
"Our EC2 instances in the us-west-2 Region are experiencing high CPU utilization (consistently above 90%) since 2024-10-01 at 18:00 UTC. This is affecting our web application's response time, increasing it from an average of 200ms to over 2 seconds. We've already checked our application logs and found no unusual activity. The instances are t3.medium and part of an Auto Scaling group named 'WebApp-frontend-asg'. We expect CPU utilization to remain below 70% under normal conditions."
To significantly reduce the resolution time and get personalized assistance, gather comprehensive information and provide it upfront.
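As a rough illustration of the details that a clear description covers, the following Python sketch collects them in one place and reports which ones are still missing before you write the case. The class name, field names, and methods are hypothetical and chosen only for this example. They aren't part of any AWS tool or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CaseDescription:
    """Hypothetical container for the details that a clear description covers."""
    affected_resources: str = ""
    symptoms: str = ""
    start_time: str = ""
    impact: str = ""
    expected_behavior: str = ""
    steps_taken: str = ""

    def missing_fields(self) -> List[str]:
        """Return the names of the fields that are still empty."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def render(self) -> str:
        """Format the filled-in fields as labeled lines for the case body."""
        lines = []
        for name, value in vars(self).items():
            if value.strip():
                label = name.replace("_", " ").capitalize()
                lines.append(f"{label}: {value.strip()}")
        return "\n".join(lines)
```

A draft that reports an empty list from missing_fields() answers the follow-up questions listed earlier before support engineering has to ask them.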
Include ARNs
In your support case, be sure to always share the Amazon Resource Names (ARNs) for the impacted resources. This helps the support team to locate and identify specific resources quickly and expedite the troubleshooting process.
You can identify the ARNs with one of the following approaches:
- AWS Management Console: Open the relevant AWS service console, locate the resource, and find the ARN in the details for the resource.
- AWS API or AWS Command Line Interface (AWS CLI): Look for the relevant service in the AWS CLI Command Reference. Depending on the AWS service, look for the relevant operation, such as describe or get.
Example of an ARN:
arn:aws:ec2:us-west-2:123456789012:instance/i-1234567890abcdef0
When you create a case with AWS Support, provide the complete and accurate ARN. However, when you share your ARNs in a public forum, replace any sensitive information, such as your actual AWS account ID, with placeholder text. The following is the general format of an ARN:
arn:aws:ec2:us-west-2:<AccountID>:instance/<ResourceID>
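The placeholder substitution described above can be sketched in Python as follows. This is a minimal example that assumes the common colon-separated layout of an ARN (partition, service, Region, account ID, resource). The function names parse_arn and redact_account_id are made up for illustration. Use the redacted form only for public forums, not for the support case itself.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN into its parts; the resource part may itself contain colons."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return Arn(*parts[1:])


def redact_account_id(arn: str, placeholder: str = "<AccountID>") -> str:
    """Return the ARN with the account ID replaced by placeholder text."""
    parsed = parse_arn(arn)
    if not parsed.account_id:
        # Some ARNs (for example, S3 bucket ARNs) carry no account ID.
        return arn
    return ":".join(
        ["arn", parsed.partition, parsed.service, parsed.region, placeholder, parsed.resource]
    )
```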
Include timestamps for events
Include timestamps based on your time zone for when the event occurred or when you first observed the issue. If there are other relevant events, then include the timestamps for those events, along with the source and destination systems. Timely and accurate timestamps can help the AWS Support team correlate events, identify patterns, and understand the timeline of the issue.
To find relevant timestamps, check the following sources:
- Amazon CloudWatch Logs: Check CloudWatch Logs for your application or service logs. The timestamp for each log entry is displayed in UTC.
- AWS CloudTrail: Check CloudTrail logs for API activity. These logs use UTC timestamps.
- Application logs: If you're collecting logs directly on your instances or in a centralized logging system, then check these logs for relevant timestamps.
- Monitoring tools: Third-party monitoring tools provide dashboards with timestamped events.
- System logs: For Amazon Elastic Compute Cloud (Amazon EC2) instances, you can check system logs for relevant timestamps.
When you report timestamps, convert them to a single time zone, such as your local time zone, for consistency. State the time zone that you're using.
To learn how to report timestamps, see the following example:
Issue Timeline (All times in EDT – UTC-4):
• 2024-10-01 18:30: First observed increased latency in our application (Source: New Relic monitoring)
• 2024-10-01 18:32: CloudWatch alarms triggered for high CPU usage on EC2 instances in us-west-2 (Source: AWS CloudWatch)
• 2024-10-01 18:35: Database connection timeouts started occurring (Source: Application logs in CloudWatch)
• 2024-10-01 18:40: Attempted to scale up EC2 instances, but Auto Scaling group failed to launch new instances (Source: AWS Auto Scaling logs in CloudTrail)
• 2024-10-01 18:45: Initiated this support case
Detailed timestamps provide AWS Support a clear chronology of events and help the team quickly understand the sequence and potential relationships between different aspects of the issue.
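Because CloudWatch Logs and CloudTrail report times in UTC, building a timeline like the one above usually involves a conversion step. The following Python sketch shows one way to do that with the standard library. It assumes input timestamps in the form 2024-10-01T22:30:00Z and uses a fixed UTC-4 offset to stand in for EDT. The function names and the event tuple layout are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-4 offset stands in for EDT, as in the example timeline.
EDT = timezone(timedelta(hours=-4), name="EDT")


def to_report_time(utc_timestamp: str, tz: timezone = EDT) -> str:
    """Convert a UTC timestamp such as 2024-10-01T22:30:00Z to the reporting time zone."""
    parsed = datetime.strptime(utc_timestamp, "%Y-%m-%dT%H:%M:%SZ")
    parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(tz).strftime("%Y-%m-%d %H:%M %Z")


def format_timeline(events, tz: timezone = EDT) -> str:
    """Sort (utc_timestamp, description, source) tuples and render one line per event."""
    lines = []
    # Timestamps in this format sort chronologically when compared as strings.
    for utc_timestamp, description, source in sorted(events):
        lines.append(f"- {to_report_time(utc_timestamp, tz)}: {description} (Source: {source})")
    return "\n".join(lines)
```

A fixed offset keeps the sketch dependency-free. For time zones with daylight saving changes, a time zone database such as the one behind Python's zoneinfo module is a better fit.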
Add detailed troubleshooting steps
In your support case, describe the troubleshooting steps that you tried before you created the support case. Add information about what worked and what didn’t. Also, if applicable, include clear steps to reproduce the problem that you’re facing. This provides the AWS Support team with context on your efforts to resolve the issue independently. The following is an example of how you can provide your troubleshooting steps:
Before creating this case, I performed the following troubleshooting steps:
• Checked Systems Manager for resource status (2024-10-01 19:00 EDT):
  - All EC2 instances showed as 'running'
  - No apparent issues with patch compliance
• Reviewed AWS Config logs (2024-10-01 19:15 EDT):
  - No recent configuration changes detected for the affected resources
  - All resources compliant with our established rules
• Analyzed CloudWatch metrics (2024-10-01 19:30 EDT):
  - Observed CPU utilization spike to 95% at 18:30 EDT
  - Network In/Out showed no unusual patterns
• Investigated Auto Scaling group behavior (2024-10-01 19:45 EDT):
  - Attempted to manually increase desired capacity
  - New instances failed to launch due to 'InsufficientInstanceCapacity' error
• Checked IAM permissions (2024-10-01 20:00 EDT):
  - Confirmed that the IAM role attached to the EC2 instances hasn't changed
  - Verified that the role has necessary permissions for our application
• Reviewed application logs in CloudWatch Logs (2024-10-01 20:15 EDT):
  - Noticed increased error rates coinciding with CPU spike
  - No clear indication of root cause in application logs
• Attempted to launch a new EC2 instance manually (2024-10-01 20:30 EDT):
  - Successfully launched in a different Availability Zone
  - Original AZ still returns 'InsufficientInstanceCapacity' error
These steps helped identify that the issue might be related to capacity in a specific Availability Zone, but I haven't been able to resolve the high CPU utilization or the Auto Scaling failures. Any guidance on next steps would be appreciated.
Information about your troubleshooting steps can significantly speed up the resolution process so that the AWS Support team can focus on unexplored areas or more advanced troubleshooting techniques.
Attach relevant log files and screenshots
Be sure to attach the log files, error outputs, and screenshots that are related to your issue in your support case. With this information, AWS Support can analyze the problem quickly and identify potential root causes. Follow these steps to find relevant information for different types of issues:
- Amazon EC2 issues:
  - AWS Systems Manager: Check Session Manager for console output. Use the Run Command to remotely and securely manage the configuration of your managed nodes at scale.
  - Connect to your Amazon EC2 instance: Connect to your instance and access the log files.
- Networking issues:
  - VPC Flow Logs: Collect information about IP traffic to and from network interfaces in your Amazon Virtual Private Cloud (Amazon VPC).
  - CloudWatch Logs: Check for any custom logging that you've set up for your networking components.
- Database issues:
  - Amazon Relational Database Service (Amazon RDS) Enhanced Monitoring: Collect real-time operating system metrics for your Amazon RDS instances.
  - Amazon RDS log files: You can access these files through the AWS Management Console or AWS CLI.
- Application issues:
  - CloudWatch Logs: If you use CloudWatch Logs for application logging, then be sure to review them.
  - AWS X-Ray: Use AWS X-Ray to collect data about traced requests to your applications.
- Security issues:
  - CloudTrail: Review API activity across your AWS infrastructure.
  - Amazon GuardDuty: Check for any detected threats or unusual activities.
- For serverless applications:
  - AWS Lambda logs: You can view logs for Lambda functions through the Lambda console, CloudWatch console, AWS CLI, or CloudWatch API.
  - API Gateway logs: You can turn these logs on to log requests and responses to CloudWatch Logs.
Be sure to remove any sensitive information before you attach logs or screenshots to your support case. Relevant logs and outputs can help the AWS Support team expedite the troubleshooting process.
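One way to strip sensitive values from a log excerpt before you attach it is a small redaction pass, sketched below in Python. The patterns are illustrative assumptions and aren't exhaustive: they cover values that look like AWS access key IDs and key=value pairs named password, secret, or token. Real logs can contain other secrets, so review the output manually. The function name redact is hypothetical.

```python
import re

# Illustrative patterns only; real logs may hold other secrets or personal data.
REDACTION_RULES = [
    # key=value pairs whose key suggests a credential; the key name is kept.
    (re.compile(r"(?i)\b(password|secret|token)=\S+"), r"\1=<redacted>"),
    # Strings shaped like access key IDs (AKIA or ASIA plus 16 characters).
    (re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b"), "<AccessKeyID>"),
]


def redact(text: str) -> str:
    """Apply each redaction rule in order and return the cleaned text."""
    for pattern, replacement in REDACTION_RULES:
        text = pattern.sub(replacement, text)
    return text
```

The sketch leaves account IDs and resource IDs in place, because AWS Support needs the complete identifiers to locate the affected resources.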
Select the right severity level
When you submit your support case, select the severity level of the issue based on its impact on your operations. This helps make sure that AWS Support prioritizes the case appropriately and assigns the necessary resources for a timely resolution. For more information, see Choosing a severity. If the issue is urgent, then you can start a live chat to request immediate contact with a support engineer. The following figure shows the different severity levels and some examples.
Note: The severity level Business Critical System Down is available only for Enterprise Support and Enterprise On-Ramp customers.
Determine the severity level for your support case based on the following information:
- Impact of the issue on your business operations
- Percentage of affected users or transactions
- Availability of a workaround
- Urgency of a resolution
- Potential financial and reputational impact (for business-critical applications)
Review and submit the case
Before you submit your case, review the case details thoroughly for accuracy and completeness. Then, choose your preferred contact method for the case.
- For urgent or critical issues, you can request a callback from AWS Support. To do so, select Phone and enter the required details.
- For critical or urgent cases, make sure that you include contact details so that AWS Support can contact you, if needed.
- The quickest way to get help is to use the Chat or Call option for all severity levels.
Maintain consistent communication
To expedite resolution and resolve the issue collaboratively, respond promptly to AWS Support inquiries and provide any additional information, if requested. The following is an example of good case correspondence:
AWS Support: "Can you please check if the security group associated with your EC2 instance allows inbound traffic on port 443?"
Your response: "Thank you for your quick reply. I've checked the security group (sg-1234abcd) associated with the affected EC2 instance (i-1234567890abcdef0). It does allow inbound traffic on port 443 from 0.0.0.0/0. I've also verified that the instance's network ACL doesn't restrict this traffic. Is there anything else I should check regarding network connectivity?"
Get cross-account support
For data and security reasons, AWS Cloud Support engineers can review only the account where you create the case. Because of this, be sure to open the support case from the account that owns the impacted resources.
If your issue spans across multiple accounts, then complete either of the following steps:
- Create a case in each relevant account, and then include a request to link them.
- If you have an Enterprise Support or Enterprise On-Ramp account, then contact your Technical Account Manager (TAM). TAMs can help you turn on cross-account support for your accounts to facilitate and expedite resolution.
Follow these guidelines to make sure that AWS Support has the necessary access and information to address your concerns efficiently.
Understand case handling
After you submit the case, a support engineer for your technical domain takes ownership of your case and works with you to resolve the issue. If the case requires advanced expertise, then the support engineer might involve subject matter experts (SMEs) internally. If your issue requires involvement of the service team, then the support engineer escalates the issue to the appropriate internal team to drive resolution. If you have an Enterprise Support or Enterprise On-Ramp account, then contact your TAM to help escalate and expedite the resolution.
Conclusion
AWS Support helps you maintain the health and performance of your cloud infrastructure. The best practices outlined in this article include providing resource ARNs, timestamps, and detailed troubleshooting steps, and engaging consistently with AWS Support. By following these practices, your organization can streamline the support process, expedite issue resolution, and enhance overall operational efficiency. Clear and comprehensive communication is key to a successful partnership with AWS Support. It promotes faster resolution of your technical challenges and minimizes the impact on your cloud operations.
Support engineers and TAMs can help you with general guidance, best practices, troubleshooting, and operational support on AWS. To learn more about our plans and offerings, see AWS Support.
About the authors
Harshit Shah
Harshit Shah, a Senior TAM at AWS based in Canada, brings more than 13 years of experience in databases, DevOps, and cloud architecture. His extensive background provides him with the ability to offer comprehensive insights that are tailored to each customer's unique needs. Harshit collaborates closely with customers, understands their challenges and goals, and leverages his expertise to propose innovative, value-driving solutions. Outside of work, he enjoys family time, traveling, playing cricket, and learning drums. Harshit's technical prowess, industry experience, and passion for customer success help him provide cutting-edge guidance so that customers can learn how to navigate their cloud landscape.
Rav Bommakanti
Rav Bommakanti is a Senior TAM on the AWS Energy team. He’s passionate about solving complex customer problems. With more than 16 years of experience in IT across various domains and technologies, he brings vast expertise in developing resilient, cost-effective, and innovative solutions. In his free time, he enjoys traveling and photography.