The InnoDB history list length increased significantly

0

Hello,

I've recently faced an issue with my Aurora database setup, highlighted by AWS DevOps Guru. The issue pertains to a significant increase in the "InnoDB history list length".

My database has been running for several months without encountering this problem, and I'm unsure why it has arisen now.

Upon inspecting the information_schema.innodb_trx table, I found two transactions that have been "locked" for over two days. Notably, these transactions lack an associated thread_id.

Can anyone advise on how to resolve this? Is there a solution that doesn't involve restarting my entire cluster?

Thank you.

4 Answers
0
Accepted Answer

Solution: https://repost.aws/knowledge-center/rds-mysql-error-1205

Section: "DML queries with XA transactions that are stuck in the PREPARED status"

profile picture
answered 6 months ago
0

Hi,

Thank you for asking your question.

Generally, commit or rollback blocking transaction will resolve this situation. But need to confirm which transaction(s) is blocking/blocked, then also confirm they can be safely rollback or terminate session.

This document stated Identify and end long-running transactions, and this knowledge base shows how to solve this situation.

Use the following methods to resolve issues with HLL growth:
 - If DML (writes) cause the HLL growth: Canceling or terminating this statement involves a rollback of the interrupted transaction. This takes a significant amount of time because all of the updates made until this point of time are being rolled back.
 - If a READ causes the HLL growth: Terminate the query using mysql.rds_kill_query.
 - Depending on how long the query is running, work with your DBA to check whether you can terminate the query using the stored procedure.

I hope this might help.

AWS
answered 6 months ago
0

Hello,

The value of trx_mysql_thread_id is set to 0.

The following query, which was recommended by AWS, returns no results. This is because the two tables cannot be joined on trx_mysql_thread_id:

SELECT a.trx_id, 
       a.trx_state, 
       a.trx_started, 
       TIMESTAMPDIFF(SECOND, a.trx_started, NOW()) AS "Duration of Open Transaction (seconds)", 
       a.trx_rows_modified, 
       b.USER, 
       b.host, 
       b.db, 
       b.command, 
       b.time, 
       b.state 
FROM   information_schema.innodb_trx a 
JOIN   information_schema.processlist b ON a.trx_mysql_thread_id = b.id
WHERE  TIMESTAMPDIFF(SECOND, a.trx_started, NOW()) > 10 
ORDER BY a.trx_started;

To address the issue, I restarted my database, even though the metrics were high. This goes against AWS's recommendations, because it takes about 30-40 minutes.

Upon restart, the issue was resolved.

profile picture
answered 6 months ago
0

Hello everyone,

Even after performing a restart, a specific transaction keeps reappearing shortly after. As time progresses, a certain metric starts to escalate, causing some concern on my end.

What's particularly unusual is that this transaction doesn't seem to have an associated thread ID, which means I can't use the KILL command to address it. Additionally, I've tried using the XA COMMIT and XA ROLLBACK commands with the provided ID, but to no avail.

Has anyone else experienced something similar or have any insights on how to tackle this?

Thanks in advance!

profile picture
answered 6 months ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions