AWS DocumentDB automatic failover when CPU utilization reaches 100%

Because of some bad queries, our primary instance was overwhelmed: CPU utilization reached 100% and queries started timing out. We had to fail over manually to a replica instance.

We are wondering why automatic failover was not initiated in this case, or whether we need some additional configuration to enable it.

If automatic failover is not supported in this scenario, one solution I can see is triggering a Lambda function to force a failover when a CPU utilization alarm breaches its threshold. If someone has a simpler solution, it would be helpful.

Asked a year ago · 371 views
1 answer

Accepted answer

Hello.

We are wondering why automatic failover was not initiated in this case, or whether we need some additional configuration to enable it.

The documentation linked below states that a failover occurs when the primary instance fails.
In other words, high CPU utilization alone does not trigger a failover unless AWS judges the instance to have failed.
https://docs.aws.amazon.com/documentdb/latest/developerguide/failover.html#:~:text=When%20the%20primary%20instance%20fails,has%20its%20own%20endpoint%20address.

When the primary instance fails, Amazon DocumentDB automatically fails over to an Amazon DocumentDB replica, if one exists.

If automatic failover is not supported in this scenario, one solution I can see is triggering a Lambda function to force a failover when a CPU utilization alarm breaches its threshold. If someone has a simpler solution, it would be helpful.

Even if you fail over manually, unless you identify and fix the root cause, the same queries will drive CPU utilization up on the new primary after the failover.
If you're okay with that, I think it's possible to automate the failover using a CloudWatch alarm, Amazon SNS, and Lambda.
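
As a rough illustration of the CloudWatch alarm, SNS, and Lambda approach, here is a minimal sketch of a Lambda handler in Python. It assumes the CPU alarm publishes to an SNS topic that the function subscribes to, and that the cluster identifier is supplied through an environment variable. The names `CLUSTER_ID`, `my-docdb-cluster`, `extract_alarm_state`, and the optional `client` parameter are hypothetical and used only for this example. The only AWS call made is `failover_db_cluster` on the boto3 `docdb` client. Treat it as a starting point under those assumptions, not a tested production setup.

```python
import json
import os


def extract_alarm_state(event):
    """Return the alarm state from an SNS-wrapped CloudWatch alarm notification."""
    # SNS delivers the CloudWatch alarm payload as a JSON string in "Message".
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    return message.get("NewStateValue")


def handler(event, context, client=None):
    """Force a DocumentDB failover when the alarm enters the ALARM state."""
    # Ignore OK and INSUFFICIENT_DATA transitions.
    if extract_alarm_state(event) != "ALARM":
        return {"failover": False}

    if client is None:
        # Imported lazily so the parsing logic can be exercised without boto3.
        import boto3
        client = boto3.client("docdb")

    # CLUSTER_ID is a hypothetical environment variable for this sketch.
    cluster_id = os.environ.get("CLUSTER_ID", "my-docdb-cluster")

    # Without a target instance, DocumentDB picks the replica to promote.
    client.failover_db_cluster(DBClusterIdentifier=cluster_id)
    return {"failover": True, "cluster": cluster_id}
```

The function's execution role would need permission to fail over the cluster, and the cluster must have at least one replica. Because the new primary receives the same queries, it may also be worth adding a cooldown so that repeated alarms do not cause the cluster to fail over back and forth.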

EXPERT
answered a year ago
EXPERT
reviewed a year ago
