The InnoDB history list length increased significantly


Hello,

I've recently faced an issue with my Aurora database setup, highlighted by AWS DevOps Guru. The issue pertains to a significant increase in the "InnoDB history list length".

My database has been running for several months without encountering this problem, and I'm unsure why it has arisen now.

Upon inspecting the information_schema.innodb_trx table, I found two transactions that have been "locked" for over two days. Notably, these transactions lack an associated thread_id.
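For context, the history list length that DevOps Guru flagged can also be checked directly from SQL (a sketch, assuming the default `innodb_metrics` counters are enabled, as they are on stock MySQL/Aurora):

```sql
-- Current InnoDB history list length (purge lag)
SELECT count
FROM   information_schema.innodb_metrics
WHERE  name = 'trx_rseg_history_len';
```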

Can anyone advise on how to resolve this? Is there a solution that doesn't involve restarting my entire cluster?

Thank you.

Asked 7 months ago · Viewed 864 times
4 Answers
Accepted Answer

Solution: https://repost.aws/knowledge-center/rds-mysql-error-1205

Section: "DML queries with XA transactions that are stuck in the PREPARED status"
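In case the linked article moves, the gist of that section is the following (a sketch; the xid value below is a placeholder, always verify the actual xids returned on your own system before committing or rolling anything back):

```sql
-- List XA transactions stuck in the PREPARED state
XA RECOVER;

-- Roll back (or commit) a stuck transaction by its xid,
-- using the data returned by XA RECOVER
XA ROLLBACK 'xid_value';
-- or: XA COMMIT 'xid_value';
```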

Answered 7 months ago

Hi,

Thank you for asking your question.

Generally, committing or rolling back the blocking transaction will resolve this situation. However, you first need to identify which transaction(s) are blocking or blocked, and then confirm that they can be safely rolled back or that their sessions can be safely terminated.

The documentation recommends identifying and ending long-running transactions, and the related knowledge base article shows how to resolve this situation.

Use the following methods to resolve issues with HLL growth:
 - If DML (writes) causes the HLL growth: canceling or terminating the statement triggers a rollback of the interrupted transaction. This can take a significant amount of time, because all of the updates made up to that point must be rolled back.
 - If a READ causes the HLL growth: Terminate the query using mysql.rds_kill_query.
 - Depending on how long the query is running, work with your DBA to check whether you can terminate the query using the stored procedure.
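The stored procedures mentioned above are invoked like this (the thread ID 12345 is a placeholder; take real IDs from the processlist first):

```sql
-- Find long-running threads and their IDs
SELECT id, user, host, db, command, time, state
FROM   information_schema.processlist
ORDER BY time DESC;

-- Terminate only the running query on a thread (e.g. thread 12345)
CALL mysql.rds_kill_query(12345);

-- Or terminate the whole connection
CALL mysql.rds_kill(12345);
```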

I hope this might help.

AWS
Answered 7 months ago

Hello,

The value of trx_mysql_thread_id is set to 0.

The following query, which was recommended by AWS, returns no results. Because trx_mysql_thread_id is 0, the join to information_schema.processlist matches no rows:

SELECT a.trx_id, 
       a.trx_state, 
       a.trx_started, 
       TIMESTAMPDIFF(SECOND, a.trx_started, NOW()) AS "Duration of Open Transaction (seconds)", 
       a.trx_rows_modified, 
       b.USER, 
       b.host, 
       b.db, 
       b.command, 
       b.time, 
       b.state 
FROM   information_schema.innodb_trx a 
JOIN   information_schema.processlist b ON a.trx_mysql_thread_id = b.id
WHERE  TIMESTAMPDIFF(SECOND, a.trx_started, NOW()) > 10 
ORDER BY a.trx_started;
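When the join matches nothing because trx_mysql_thread_id is 0, the innodb_trx table can still be inspected on its own to see the stuck transactions (a variant of the query above, without the processlist join):

```sql
SELECT trx_id,
       trx_state,
       trx_started,
       TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS open_seconds,
       trx_rows_modified,
       trx_mysql_thread_id
FROM   information_schema.innodb_trx
WHERE  trx_mysql_thread_id = 0
ORDER BY trx_started;
```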

To address the issue, I restarted my database even though the metrics were still high. This goes against AWS's recommendations, and the restart took about 30-40 minutes.

Upon restart, the issue was resolved.

Answered 7 months ago

Hello everyone,

Even after performing a restart, a specific transaction keeps reappearing shortly after. As time progresses, a certain metric starts to escalate, causing some concern on my end.

What's particularly unusual is that this transaction doesn't seem to have an associated thread ID, which means I can't use the KILL command to address it. Additionally, I've tried using the XA COMMIT and XA ROLLBACK commands with the provided ID, but to no avail.
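One thing worth trying when XA COMMIT / XA ROLLBACK fail on the literal xid: if the xid contains non-printable bytes, the plain-text form shown by XA RECOVER cannot be pasted back correctly. XA RECOVER CONVERT XID (available since MySQL 5.7.5) prints xids in hexadecimal, which can then be passed back verbatim (a sketch; the hex string below is a placeholder):

```sql
-- Print xids in hexadecimal so they can be pasted back verbatim
XA RECOVER CONVERT XID;

-- Then roll back using the hex form returned above, e.g.:
-- XA ROLLBACK X'68657861646563696d616c786964';
```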

Has anyone else experienced something similar or have any insights on how to tackle this?

Thanks in advance!

Answered 7 months ago
