Why did aurora spike and lock up?

0

This morning our aurora mysql serverless 2 spiked and basically locked up. We have a writer/reader and a reader associated with the cluster.

On the reader we had an intense select query. This is the query RDS shows (field names changed), the point is it has multiple FIND_IN_SET functions and insights show it examined 109826.50 rows.

SELECT thisstuff ( thistime ) AS thestuff , COUNT ( * ) AS count FROM mytable WHERE todo = ? AND someID = ? AND ( ( FIND_IN_SET ( otherID , ? ) ) || ( FIND_IN_SET ( otherID , ? ) ) || ( FIND_IN_SET ( otherID , ? ) ) ) AND ( thistime BETWEEN ? AND ? ) AND some_status = ? GROUP BY months_todo ( thistime )

At the same time our writer had the following waits going on. wait/io/redo_log_flush synch/sxlock/innodb/hash_table_locks wait/io/table/sql/handler

Can somebody please help me understand why this happened, as in did the writer lock the tables or did the FIND_IN_SET lock the writer? The writer was making updates to the same table at the time the query was going on. Most importantly there a way to prevent this?

Thank You Mike

mmoore
질문됨 3달 전114회 조회
1개 답변
1
수락된 답변

It seems there were a few factors contributing to the slow SELECT query performance on your Aurora MySQL serverless cluster:

The query was doing multiple FIND_IN_SET operations which can be expensive, especially if the tables are large and it has to examine a lot of rows.

At the same time, the writer was experiencing waits on redo log flushes, table locks and innodb hash table locks. This indicates the server was under heavy load and experiencing locking contention from concurrent DML operations.

You mentioned the cluster spiked and locked up, so the resources were likely overutilized as well. Aurora Serverless clusters are meant to auto scale but may not have been able to keep up with sudden spikes in load.

Some things you could try:

  • Check CPU, memory and I/O metrics to identify resource bottlenecks
  • Add indexes on columns used in WHERE, JOIN and ORDER BY clauses to avoid full table scans
  • Rewrite the query using temporary tables or subqueries to avoid multiple FIND_IN_SET calls
  • Increase the capacity settings if spikes are frequent to allow more headroom
profile picture
전문가
답변함 2달 전
profile picture
전문가
검토됨 2달 전

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인