Increased latency after switching to Aurora I/O Optimized


We recently switched our database from plain RDS for PostgreSQL to Aurora I/O Optimized. For our use case this did not lead to noticeable cost savings, but we now have a more resilient setup running in a cluster, which makes it trivial to add more read replicas if and when the need arises.

The problem is that the latency of our queries went way up. We had the previous setup tuned quite well, and our average latency was well below 100 ms. After switching, the average shot up to roughly 1000 ms. This is puzzling, as Aurora I/O Optimized is advertised as having better performance than the other options.

In Performance Insights, the only obvious difference is that the read replica spends a lot of time on IO:SLRURead, which did not show up at all on the old setup. I've searched for more information on this, but came up pretty much empty-handed.

So the question is: how can I figure out what is causing this degraded performance, and which parameters could I tune to try to improve things?

Chris
asked 6 months ago, 1297 views
2 Answers
Accepted Answer

Hi,

The IO:SLRURead wait event occurs when a PostgreSQL process is waiting for a read of a simple least-recently-used (SLRU) page.

An example of heavy SLRU page use causing this kind of problem, along with the solution, is described here: https://about.gitlab.com/blog/2021/09/29/why-we-spent-the-last-month-eliminating-postgresql-subtransactions/

This deck describes a similar issue: https://pgconf.in/files/presentations/2023/Dilip_Kumar_RareExtremelyChallengingPostgresPerformanceProblems.pdf

Maybe it matches your use case?
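
One way to check whether it does: PostgreSQL 13 and later expose per-cache SLRU counters in the pg_stat_slru view. Below is a minimal sketch in Python, assuming that view is available on your Aurora PostgreSQL version and that you fetch its rows with whatever client you already use. The function name rank_slru_reads and the sample numbers are made up for illustration.

```python
# Query to run on the reader instance (pg_stat_slru exists in PostgreSQL 13+).
SLRU_QUERY = "SELECT name, blks_hit, blks_read FROM pg_stat_slru"


def rank_slru_reads(rows):
    """Rank SLRU caches by the number of blocks read from disk.

    rows: iterable of (name, blks_hit, blks_read) tuples, as returned by SLRU_QUERY.
    Returns a list of (name, blks_read, miss_ratio), highest blks_read first.
    """
    ranked = []
    for name, blks_hit, blks_read in rows:
        total = blks_hit + blks_read
        # Avoid dividing by zero for caches that were never touched.
        miss_ratio = blks_read / total if total else 0.0
        ranked.append((name, blks_read, miss_ratio))
    ranked.sort(key=lambda r: r[1], reverse=True)
    return ranked


# Illustrative numbers only, not real measurements.
sample = [("Xact", 900, 100), ("Subtrans", 100, 900), ("Notify", 0, 0)]
for name, blks_read, miss_ratio in rank_slru_reads(sample):
    print(f"{name}: {blks_read} blocks read, miss ratio {miss_ratio:.2f}")
```

If the subtransaction cache (reported as Subtrans on PostgreSQL 13 through 16, as far as I know) dominates the list, the subtransaction problem from the GitLab post is a plausible candidate.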

Another possibility is a large number of sequential table scans: do the EXPLAIN ANALYZE plans for your slow queries show many sequential scans?
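
If you have many plans to go through, the check can be scripted. Here is a rough sketch in Python that walks the output of EXPLAIN (ANALYZE, FORMAT JSON) and lists every Seq Scan node. The helper name find_seq_scans and the sample plan (a hypothetical join between "orders" and "customers") are assumptions for illustration, not taken from your workload.

```python
import json


def find_seq_scans(explain_json):
    """Return (relation, actual_rows) for every Seq Scan node in a plan.

    explain_json: the text produced by EXPLAIN (ANALYZE, FORMAT JSON) <query>.
    """
    found = []

    def walk(node):
        if node.get("Node Type") == "Seq Scan":
            found.append((node.get("Relation Name"), node.get("Actual Rows")))
        # Child nodes, if any, are nested under the "Plans" key.
        for child in node.get("Plans", []):
            walk(child)

    # The JSON output is a list with one entry per statement.
    for entry in json.loads(explain_json):
        walk(entry["Plan"])
    return found


# Hand-written sample plan, not real EXPLAIN output.
sample_plan = json.dumps([{
    "Plan": {
        "Node Type": "Hash Join",
        "Plans": [
            {"Node Type": "Seq Scan", "Relation Name": "orders", "Actual Rows": 250000},
            {"Node Type": "Hash", "Plans": [
                {"Node Type": "Index Scan", "Relation Name": "customers", "Actual Rows": 12},
            ]},
        ],
    }
}])
print(find_seq_scans(sample_plan))
```

A sequential scan over a small table is usually harmless; the ones worth a closer look are those that return or discard a large number of rows.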

Best,

Didier

AWS
EXPERT
answered 6 months ago

Thank you for the answer, Didier.

I spent the last couple of weeks monitoring and trying to figure out the cause of the slow queries. In the end, it turned out to be mostly the fault of Sequelize generating inefficient queries in a few specific instances. I rewrote those as raw SQL queries, and performance is back to normal now.

The IO:SLRURead wait really threw me off, but I think it was just a coincidence that I started seeing it after migrating to Aurora I/O Optimized.

For anybody in the future who has a similar issue and suspects that Aurora I/O Optimized could be the root of the performance problems: in my experience, you can expect performance to be pretty much on par with plain RDS for PostgreSQL.

Thanks again!

Chris
answered 6 months ago
