How do I troubleshoot and resolve high CPU utilization on my Amazon RDS for MySQL or Amazon Aurora MySQL instance?

9 minute read
0

I'm experiencing high CPU utilization on my Amazon Relational Database Service (Amazon RDS) for MySQL DB instances or my Amazon Aurora MySQL-Compatible Edition instances. How can I troubleshoot and resolve high CPU utilization?

Short description

Increases in CPU utilization can be caused by several factors, such as user-initiated heavy workloads, multiple concurrent queries, or long-running transactions.

To identify the source of the CPU usage in your Amazon RDS for MySQL instance, review the following approaches:

  • Enhanced Monitoring
  • Performance Insights
  • Queries that detect the cause of CPU utilization in the workload
  • Logs with activated monitoring

After you identify the source, you can analyze and optimize your workload to reduce CPU usage.

Resolution

Using Enhanced Monitoring

Enhanced Monitoring provides a view at the operating system (OS) level. This view can help identify the cause of a high CPU load at a granular level. For example, you can review the load average, CPU distribution (system% or nice%), and OS process list.

Using Enhanced Monitoring, you can check the loadAverageMinute data in intervals of 1, 5, and 15 minutes. A load average that's greater than the number of vCPUs indicates that the instance is under a heavy load. Also, if the load average is less than the number of vCPUs for the DB instance class, CPU throttling might not be the cause for application latency. Check the load average to avoid false positives when diagnosing the cause of CPU usage.

For example, if you have a DB instance that's using a db.m5.2xlarge instance class with 3000 Provisioned IOPS that reaches the CPU limit, you can review the following example metrics to identify the root cause of the high CPU usage. In the following example, the instance class has eight vCPUs associated with it. For the same load average, exceeding 170 indicates that the machine is under heavy load during the timeframe measured:

Load Average Minute

Fifteen170.25
Five391.31
One596.74

CPU Utilization

User (%)0.71
System (%)4.9
Nice (%)93.92
Total (%)99.97

Note: Amazon RDS gives your workload a higher priority over other tasks that are running on the DB instance. To prioritize these tasks, workload tasks have a higher Nice value. As a result, in Enhanced Monitoring, Nice% represents the amount of CPU being used by your workload against the database.

After turning on Enhanced Monitoring, you can also check the OS process list that's associated with the DB instance. Enhanced monitoring shows a maximum of 100 processes. This can help you identify which processes have the largest impact on performance based on CPU and memory use.

In the operating system (OS) process list section of Enhanced Monitoring, review the OS processes and RDS processes. Confirm the percentage of CPU utilization of a mysqld or Aurora process. These metrics can help you confirm whether the increase in CPU utilization is caused by OS or by RDS processes. Or, you can use these metrics to monitor any CPU usage increases caused by mysqld or Aurora. You can also see the division of CPU utilization by reviewing the metrics for cpuUtilization. For more information, see Monitoring OS metrics with Enhanced Monitoring.

Note: If you activate Performance Schema, then you can map the OS thread ID to the process ID of your database. For more information, see Why is my Amazon RDS DB instance using swap memory when I have sufficient memory?

Using Performance Insights

You can use Performance Insights to identify the exact queries that are running on the instance and causing high CPU usage. First, activate Performance Insights for MySQL. Then, you can use Performance Insights to optimize your workload. Be sure to consult with your DBA.

To see database engines that you can use with Performance Insights, see Monitoring DB load with Performance Insights on Amazon RDS.

Using queries to detect the cause of CPU utilization in the workload

Before you can optimize your workload, you must identify the problematic query. You can run the following queries while the high CPU issue is occurring to identify the root cause of the CPU utilization. Then, optimize your workload to reduce your CPU usage.

The SHOW PROCESSLIST command shows you the threads that are running currently on your MySQL instance. Sometimes, the same set of statements might continue running without completion. When this happens, the subsequent statements must wait for the first set of statements to finish. This is because InnoDB row-level locking might be updating the same rows. For more information, see SHOW PROCESSLIST statement on the MySQL website.

SHOW FULL PROCESSLIST;

Note: Run the SHOW PROCESSLIST query as the primary system user. If you're not the primary system user user, then you must have MySQL PROCESS server administration privileges to see all the threads running on a MySQL instance. Without admin privileges, SHOW PROCESSLIST shows only the threads associated with the MySQL account that you're using.

The INNODB_TRX table provides information about all currently running InnoDB transactions that aren't read-only transactions.

SELECT * FROM INFORMATION_SCHEMA.INNODB_TRX;

The INNODB_LOCKS table provides information about locks that an InnoDB transaction has requested but hasn't received.

For MySQL 5.7 or earlier:

SELECT * FROM INFORMATION_SCHEMA.INNODB_LOCKS;

For MySQL 8.0:

SELECT * FROM performance_schema.data_locks;

The INNODB_LOCK_WAITS table provides one or more rows for each blocked InnoDB transaction.

For MySQL 5.7 or earlier:

SELECT * FROM INFORMATION_SCHEMA.INNODB_LOCK_WAITS;

For MySQL 8.0:

SELECT * FROM performance_schema.data_lock_waits;

You can run a query similar to the following to see the transactions that are waiting, and the transactions that are blocking the waiting transactions. For more information, see Using InnoDB transaction and locking information on the MySQL website.

For MySQL 5.7 or earlier:

SELECT
  r.trx_id waiting_trx_id,
  r.trx_mysql_thread_id waiting_thread,
  r.trx_query waiting_query,
  b.trx_id blocking_trx_id,
  b.trx_mysql_thread_id blocking_thread,
  b.trx_query blocking_query
FROM       information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b
  ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r
  ON r.trx_id = w.requesting_trx_id;

For MySQL 8.0:

SELECT
  r.trx_id waiting_trx_id,
  r.trx_mysql_thread_id waiting_thread,
  r.trx_query waiting_query,
  b.trx_id blocking_trx_id,
  b.trx_mysql_thread_id blocking_thread,
  b.trx_query blocking_query
FROM       performance_schema.data_lock_waits w
INNER JOIN information_schema.innodb_trx b
  ON b.trx_id = w.blocking_engine_transaction_id
INNER JOIN information_schema.innodb_trx r
  ON r.trx_id = w.requesting_engine_transaction_id;

The SHOW ENGINE INNODB STATUS query provides information from the standard InnoDB monitor about the state of the InnoDB storage engine. For more information, see SHOW ENGINE statement on the MySQL website.

SHOW ENGINE INNODB STATUS;

The SHOW [GLOBAL | SESSION] STATUS provides information about the server status. For more information, see SHOW STATUS statement on the MySQL website.

SHOW GLOBAL STATUS;

Note: These queries were tested on Aurora 2.x (MySQL 5.7); Aurora 1. x (MySQL 5.6); MariaDB 10.x. Additionally, the INFORMATION_SCHEMA.INNODB_LOCKS table is no longer supported as of MySQL 5.7.14 and removed in MySQL 8.0. The performance_schema.data_locks table replaces the INFORMATION_SCHEMA.INNODB_LOCKS table. For more information, see The data_locks table on the MySQL website.

Analyzing logs and turning on monitoring

When you analyze logs or want to activate monitoring in Amazon RDS for MySQL, consider the following approaches:

  • Analyze the MySQL General Query Log to view what the mysqld is doing at a specific time. You can also view the queries that are running on your instance at a specific time, including information about when clients connect or disconnect. For more information, see The General Query Log on the MySQL website.
    Note: When you activate the General Query Log for long periods, the logs consume storage and can add to performance overhead.
  • Analyze the MySQL Slow Query Logs to find queries that take longer to run than the seconds that you set for long_query_time. You can also review your workload and analyze your queries to improve performance and memory consumption. For more information, see The Slow Query Log on the MySQL website. Tip: When you use Slow Query Log or General Query Log, set the parameter log_output to FILE.
  • Use the MariaDB Audit Plugin to audit database activity. For example, you can track users that are logging on to the database or queries that are run against the database. For more information, see MariaDB Audit Plugin support.
  • If you use Aurora for MySQL, then you can also use Advanced Auditing. Auditing can give you more control over the types of queries you want to log. Doing so reduces the overhead for logging.
  • Use the innodb_print_all_deadlocks parameter to check for deadlocks and resource locking. You can use this parameter to record information about deadlocks in InnoDB user transactions in the MySQL error log. For more information, see innodb_print_all_deadlocks on the MySQL website.

Analyzing and optimizing the high CPU workload

After you identify the query that's increasing CPU usage, optimize your workload to reduce the CPU consumption.

If you see a query that's not required for your workload, you can terminate the connection using the following command:

CALL mysql.rds_kill(processID);

To find the processID of a query, run the SHOW FULL PROCESSLIST command.

If you don't want to end the query, then optimize the query using EXPLAIN. The EXPLAIN command shows the individual steps involved in running a query. For more information, see Optimizing Queries with EXPLAIN on the MySQL website.

To review profile details, activate PROFILING. The PROFILING command can indicate resource usage for statements that are running during the current session. For more information, see SHOW PROFILE statement on the MySQL website.

To update any table statistics, use ANALYZE TABLE. The ANALYZE TABLE command can help the optimizer choose an appropriate plan to run the query. For more information, see ANALYZE TABLE statement on the MySQL website.


Related information

Amazon RDS for MySQL

Amazon RDS for MariaDB

How do I activate and monitor logs for an Amazon RDS MySQL DB instance?

Tuning Amazon RDS for MySQL with Performance Insights