Potentially degraded RDS volume - Latency & I/O spikes


Similar to this thread https://forums.aws.amazon.com/thread.jspa?messageID=905866 , we are experiencing sudden spikes in read/write latency and queue depth on a production Multi-AZ db.r3.4xlarge MySQL instance in us-east-1. Each spike lasts a few hours and then returns to normal. This has occurred every day for the past week, at around the same time of day (give or take a few hours). There is no increase in connections, workload, web-layer traffic, cron jobs, etc. Queries simply start crawling, which leads to a large backlog of active connections and ultimately to timed-out web requests.

We've turned on Enhanced Monitoring and see the physical device read/write IOs plummet. xvdi is the only physical device that shows a huge jump in Avg Queue Size, Avg Request Size, Disk I/O Await, Disk I/O Util, Read Total, and Write Total.

We believe volume xvdi is degraded. This is for maindb in 939284280993. DM for more identifying info if needed.

Can someone from AWS please look into this ASAP?

Edited by: csscif on Jul 7, 2019 11:18 AM


csscif
Asked 5 years ago, 924 views
1 answer

Hi,
I took a look at your instance, and there is no issue with its storage volumes. Rather, you have 2 TB of gp2 allocated storage, so your baseline performance is 6000 IOPS.
Your workload is consistently using more IOPS than that baseline.
Your burst balance is currently completely depleted, so you are being throttled at the baseline of 6000 IOPS.
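
For reference, the 6000 figure follows from the general gp2 rule of 3 IOPS per GiB of allocated storage, with a floor of 100 and a cap of 16,000 IOPS per volume. Below is a minimal sketch of that arithmetic. It assumes "2 TB" means 2000 GiB and treats the storage as one gp2 volume; the function name is made up for illustration, and RDS may lay storage out across several volumes, so actual limits can differ.

```python
def gp2_baseline_iops(allocated_gib: int) -> int:
    """Estimate gp2 baseline IOPS for a single volume.

    Rule used: 3 IOPS per GiB, never below 100, never above 16,000.
    This is an illustrative helper, not an AWS API.
    """
    return max(100, min(16_000, 3 * allocated_gib))


# Assumption: the 2 TB allocation in this thread is 2000 GiB.
print(gp2_baseline_iops(2000))
```

With 2000 GiB the function returns 6000, matching the throttled rate described above.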

Here is a blog post with more info about burst versus baseline:
https://aws.amazon.com/blogs/database/understanding-burst-vs-baseline-performance-with-amazon-rds-and-gp2/

You could increase IOPS by allocating a larger gp2 volume.
In your case, because you have a legacy volume layout, the conversion to larger storage will occur online but will take about 24 hours.
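
To get a rough idea of how much gp2 storage a given baseline would need, the 3 IOPS per GiB rule can be inverted. This is only a sizing sketch under the same single-volume assumption; the function name and the example target of 9000 IOPS are hypothetical, and the right target should come from the IOPS your workload actually sustains in CloudWatch.

```python
import math

GP2_IOPS_PER_GIB = 3      # general gp2 baseline rate
GP2_MAX_IOPS = 16_000     # per-volume gp2 ceiling


def gp2_gib_for_iops(target_iops: int) -> int:
    """Smallest allocation (GiB) whose gp2 baseline meets target_iops.

    Illustrative helper, not an AWS API.
    """
    if target_iops > GP2_MAX_IOPS:
        raise ValueError("target exceeds the gp2 per-volume IOPS ceiling")
    return math.ceil(target_iops / GP2_IOPS_PER_GIB)


# Hypothetical target: a workload that sustains about 9000 IOPS.
print(gp2_gib_for_iops(9000))
```

For a 9000 IOPS target this gives 3000 GiB. Targets above the gp2 ceiling raise an error here, since a larger gp2 allocation alone would not reach them.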

Alternatively, since you are using a lot of read IOPS, you might be able to tune your workload to do fewer reads.

hth,
Phil

AWS
Moderator
philaws
Answered 5 years ago
