Why did aurora spike and lock up?

0

This morning our aurora mysql serverless 2 spiked and basically locked up. We have a writer/reader and a reader associated with the cluster.

On the reader we had an intense select query. This is the query RDS shows (field names changed), the point is it has multiple FIND_IN_SET functions and insights show it examined 109826.50 rows.

SELECT thisstuff ( thistime ) AS thestuff , COUNT ( * ) AS count FROM mytable WHERE todo = ? AND someID = ? AND ( ( FIND_IN_SET ( otherID , ? ) ) || ( FIND_IN_SET ( otherID , ? ) ) || ( FIND_IN_SET ( otherID , ? ) ) ) AND ( thistime BETWEEN ? AND ? ) AND some_status = ? GROUP BY months_todo ( thistime )

At the same time our writer had the following waits going on. wait/io/redo_log_flush synch/sxlock/innodb/hash_table_locks wait/io/table/sql/handler

Can somebody please help me understand why this happened, as in did the writer lock the tables or did the FIND_IN_SET lock the writer? The writer was making updates to the same table at the time the query was going on. Most importantly there a way to prevent this?

Thank You Mike

mmoore
asked 2 months ago104 views
1 Answer
1
Accepted Answer

It seems there were a few factors contributing to the slow SELECT query performance on your Aurora MySQL serverless cluster:

The query was doing multiple FIND_IN_SET operations which can be expensive, especially if the tables are large and it has to examine a lot of rows.

At the same time, the writer was experiencing waits on redo log flushes, table locks and innodb hash table locks. This indicates the server was under heavy load and experiencing locking contention from concurrent DML operations.

You mentioned the cluster spiked and locked up, so the resources were likely overutilized as well. Aurora Serverless clusters are meant to auto scale but may not have been able to keep up with sudden spikes in load.

Some things you could try:

  • Check CPU, memory and I/O metrics to identify resource bottlenecks
  • Add indexes on columns used in WHERE, JOIN and ORDER BY clauses to avoid full table scans
  • Rewrite the query using temporary tables or subqueries to avoid multiple FIND_IN_SET calls
  • Increase the capacity settings if spikes are frequent to allow more headroom
profile picture
EXPERT
answered 2 months ago
profile picture
EXPERT
reviewed a month ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions