When I activate Performance Insights, my DB instance shows a large number of Average Active Sessions (AAS) that are waiting for synchronization (SYNCH) wait events. I want to improve the performance of my DB instance.
Short description
MySQL SYNCH wait events in Performance Insights occur when multiple database sessions are accessing the same protected objects or memory structures. The following objects are protected in Amazon Relational Database Service (Amazon RDS) for MySQL, Amazon RDS for MariaDB, and Amazon Aurora MySQL-Compatible Edition:
- The active binary log file in a binlog source instance that contains a mutex that allows only one session to read or write it at a time.
- The data dictionary that's protected for data control language (DCL) or data definition language (DDL) statement writes.
- The adaptive hash index that contains a mutex that allows only one session to read or write it at a time.
- The open table cache that allows only one session to add or remove a table from the cache.
- Each database block in the InnoDB buffer pool where only one session can modify the content of a block in memory at a time.
Resolution
Make sure that the DB instance has enough CPU resources to manage the workload
A high number of sessions that are waiting for SYNCH events causes high CPU usage. At 100% usage, the number of waiting sessions increases. To troubleshoot this issue, increase the size of your DB instance so that there's enough CPU to process the extra workload.
Because SYNCH events typically don't last long, the CPUUtilization Amazon CloudWatch metric might not correctly show the peak usage. This metric has only a 60-second granularity. To check peak usage, use one-second granularity counters in Enhanced Monitoring on the Amazon RDS console.
Increase the MySQL mutex or lock wait array
MySQL uses an internal data structure to coordinate threads with a default array size of 1. This configuration works for single-CPU machines but can cause issues on machines with several CPUs. If your workload has a large number of waiting threads, then increase the array size. Modify the MySQL innodb_sync_array_size parameter so that it's equal to or higher than the number of CPUs. For more information, see innodb_sync_array_size on the MySQL website.
Note: MySQL use the innodb_sync_array_size parameter only at database startup.
Reduce concurrency
Typically, parallelism improves throughput. However, when a large number of sessions run the same or similar activities, the sessions must have access to the same protected objects. The larger the quantity of sessions, the more CPU you use when you're waiting.
To resolve this issue, spread the activities over time, or schedule them in series. You can also bundle several operations into a single statement, such as multirow inserts.
Troubleshoot your issue based on the wait events
Take the following troubleshooting actions based on the wait event that you receive.
synch/rwlock/innodb/dict sys RW lock or synch/rwlock/innodb/dict_operation_lock
These events occur when you invoke a high number of concurrent DCLs or DDLs at the same time. To troubleshoot these issues, reduce the application's dependency on DDLs during regular application activity.
synch/cond/sql/MDL_context::COND_wait_status
This event occurs when a high number of SQLs, including selects, access a table that a DCL or DDL is modifying. To troubleshoot this issue, don't run DDL statements on high-traffic tables during regular application activity.
synch/mutex/sql/LOCK_open or synch/mutex/sql/LOCK_table_cache
These events occur when the number of tables that your sessions open exceeds the size of the table definition cache or the table open cache. To troubleshoot this issue, increase the size of the caches. For more information, see 10.4.3.1 How MySQL opens and closes tables on the MySQL website.
synch/mutex/sql/LOG
This event occurs when your database runs a large number of statements that the current logging methods can't support. If you use the TABLE output method, then use FILE instead. If you use the general log, then use Advanced Auditing in Aurora instead. If you set the long_query_time parameter to 0 or a value lower than 1, then increase the value.
synch/mutex/innodb/buf_pool_mutex, synch/mutex/innodb/aurora_lock_thread_slot_futex or synch/rwlock/innodb/index_tree_rw_lock
These events occur when a large number of similar data manipulation language (DML) statements access the same database object at the same time. To resolve this issue, use multirow statements and partitions to spread the workload over different database objects.
synch/mutex/innodb/aurora_lock_thread_slot_futex
This event occurs when one session locked a row for an update, and then another session tries to update the row. Troubleshoot this issue based the other wait events that you receive. Find and respond to the SQL statements responsible for this wait event, or find and respond to the blocking session.
synch/cond/sql/MYSQL_BIN_LOG::COND_done, synch/mutex/sql/MYSQL_BIN_LOG::LOCK_commit, or synch/mutex/sql/MYSQL_BIN_LOG::LOCK_log
These events occur when you activated binary logging, and there's a large volume of commit throughput, committing transactions, or replicas that are reading binlogs. To troubleshoot this issue, use multirow statements or group multiple statements into a single transaction.
For more information about Aurora MySQL-Compatible wait events, see Tuning Aurora MySQL with wait events and Aurora MySQL wait events.
It's a best practice to upgrade the database to a major version that's compatible with 8.0 or later. When possible, use multirow statements, or group several statements into a single transaction. In Aurora, use the Aurora Global Database instead of binary log replication, or use the aurora_binlog parameters.
Related information
Monitoring DB load with Performance Insights on Amazon RDS
Parameter groups for Amazon RDS