Increased latency after switching to Aurora I/O Optimized

We recently switched our database from plain RDS for PostgreSQL to Aurora I/O Optimized. For our use case, this did not lead to noticeable cost savings, but at least we now have a more resilient setup running in a cluster, which makes it trivial to add more read replicas if and when the need arises.

The problem is that the latency of our queries went way up. We had the previous setup tuned quite well, and our average latency was well below 100 ms. After switching, the average shot up to roughly 1000 ms. This is a bit puzzling, as Aurora I/O Optimized is also advertised as having better performance than the other options.

Looking at Performance Insights, the only obvious difference is that the read replica spends a lot of time on IO:SLRURead, which was not there at all on the old setup. I've tried searching for more information on this, but came up pretty much empty-handed.

The question, I guess, is: how can I figure out what is causing this degraded performance, and which parameters could I tune to try to improve things?

Chris
Asked 7 months ago, 1412 views
2 Answers
Accepted Answer

Hi,

The IO:SLRURead wait event occurs when a PostgreSQL process is waiting for a read of a simple least-recently-used (SLRU) page.

An example of heavy SLRU page use, together with a solution, is described here: https://about.gitlab.com/blog/2021/09/29/why-we-spent-the-last-month-eliminating-postgresql-subtransactions/

This deck covers a similar issue: https://pgconf.in/files/presentations/2023/Dilip_Kumar_RareExtremelyChallengingPostgresPerformanceProblems.pdf

Maybe it matches your use case?
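
One way to check whether SLRU reads are actually significant is to look at the `pg_stat_slru` view (available in PostgreSQL 13 and later). The sketch below is only an illustration: it assumes you have already fetched `(name, blks_hit, blks_read)` rows with whatever client you use, and the helper name `slru_read_share` and the sample numbers are made up for the example.

```python
# Query to run on the instance (assumes PostgreSQL 13+):
SLRU_QUERY = "SELECT name, blks_hit, blks_read FROM pg_stat_slru;"


def slru_read_share(rows):
    """Return, per SLRU cache, the fraction of accesses that needed a disk read.

    rows: iterable of (name, blks_hit, blks_read) tuples from pg_stat_slru.
    """
    report = {}
    for name, hits, reads in rows:
        total = hits + reads
        # Avoid dividing by zero for caches that were never touched.
        report[name] = reads / total if total else 0.0
    return report


# Hypothetical sample rows, not real measurements.
sample_rows = [("Subtrans", 900, 100), ("Xact", 0, 0)]
print(slru_read_share(sample_rows))
```

A cache with a high read share on the replica (for example the subtransaction one) would point towards the kind of problem described in the links above.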

Another possibility is lots of sequential table scans: are EXPLAIN ANALYZE plans for your queries indicating a lot of sequential scans?
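
To spot sequential scans quickly, the plan can be produced with `EXPLAIN (ANALYZE, FORMAT JSON)` and walked programmatically. This is a minimal sketch, not a complete tool: the function name `find_seq_scans` and the sample plan are hypothetical, while the keys (`Plan`, `Plans`, `Node Type`, `Relation Name`, `Actual Rows`) are the ones PostgreSQL emits in JSON plans.

```python
import json


def find_seq_scans(explain_json):
    """Return (relation, actual_rows) for every Seq Scan node in a JSON plan."""
    plans = json.loads(explain_json) if isinstance(explain_json, str) else explain_json
    found = []
    # EXPLAIN (FORMAT JSON) returns a list of objects, each with a top-level "Plan".
    stack = [entry["Plan"] for entry in plans]
    while stack:
        node = stack.pop()
        if node.get("Node Type") == "Seq Scan":
            found.append((node.get("Relation Name"), node.get("Actual Rows")))
        # Child nodes, if any, live under "Plans".
        stack.extend(node.get("Plans", []))
    return found


# Hypothetical plan, trimmed to the keys used above.
sample_plan = """[{"Plan": {"Node Type": "Nested Loop", "Plans": [
  {"Node Type": "Seq Scan", "Relation Name": "orders", "Actual Rows": 250000},
  {"Node Type": "Index Scan", "Relation Name": "users", "Actual Rows": 1}
]}}]"""
print(find_seq_scans(sample_plan))
```

A sequential scan returning many rows on a large table is usually the first candidate for an index or a query rewrite.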

Best,

Didier

AWS Expert
Answered 7 months ago

Thank you for the answer, Didier.

I spent the last couple of weeks monitoring and trying to figure out the cause of the slow queries. In the end it turned out to be mostly the fault of Sequelize generating inefficient queries in a few specific instances. I rewrote those as raw SQL queries, and performance is back to normal now.

The IO:SLRURead really threw me off, but I think it was just a coincidence that I started seeing this after migrating to Aurora I/O Optimized.

For anybody who runs into a similar issue in the future and suspects that Aurora I/O Optimized is the root of the performance problem: in my experience, you can expect performance to be pretty much on par with plain RDS for PostgreSQL.

Thanks again!

Chris
Answered 6 months ago