Why did aurora spike and lock up?

0

This morning our aurora mysql serverless 2 spiked and basically locked up. We have a writer/reader and a reader associated with the cluster.

On the reader we had an intense select query. This is the query RDS shows (field names changed), the point is it has multiple FIND_IN_SET functions and insights show it examined 109826.50 rows.

SELECT thisstuff ( thistime ) AS thestuff , COUNT ( * ) AS count FROM mytable WHERE todo = ? AND someID = ? AND ( ( FIND_IN_SET ( otherID , ? ) ) || ( FIND_IN_SET ( otherID , ? ) ) || ( FIND_IN_SET ( otherID , ? ) ) ) AND ( thistime BETWEEN ? AND ? ) AND some_status = ? GROUP BY months_todo ( thistime )

At the same time our writer had the following waits going on. wait/io/redo_log_flush synch/sxlock/innodb/hash_table_locks wait/io/table/sql/handler

Can somebody please help me understand why this happened, as in did the writer lock the tables or did the FIND_IN_SET lock the writer? The writer was making updates to the same table at the time the query was going on. Most importantly there a way to prevent this?

Thank You Mike

mmoore
gefragt vor 3 Monaten115 Aufrufe
1 Antwort
1
Akzeptierte Antwort

It seems there were a few factors contributing to the slow SELECT query performance on your Aurora MySQL serverless cluster:

The query was doing multiple FIND_IN_SET operations which can be expensive, especially if the tables are large and it has to examine a lot of rows.

At the same time, the writer was experiencing waits on redo log flushes, table locks and innodb hash table locks. This indicates the server was under heavy load and experiencing locking contention from concurrent DML operations.

You mentioned the cluster spiked and locked up, so the resources were likely overutilized as well. Aurora Serverless clusters are meant to auto scale but may not have been able to keep up with sudden spikes in load.

Some things you could try:

  • Check CPU, memory and I/O metrics to identify resource bottlenecks
  • Add indexes on columns used in WHERE, JOIN and ORDER BY clauses to avoid full table scans
  • Rewrite the query using temporary tables or subqueries to avoid multiple FIND_IN_SET calls
  • Increase the capacity settings if spikes are frequent to allow more headroom
profile picture
EXPERTE
beantwortet vor 2 Monaten
profile picture
EXPERTE
überprüft vor 2 Monaten

Du bist nicht angemeldet. Anmelden um eine Antwort zu veröffentlichen.

Eine gute Antwort beantwortet die Frage klar, gibt konstruktives Feedback und fördert die berufliche Weiterentwicklung des Fragenstellers.

Richtlinien für die Beantwortung von Fragen