AWS DocumentDB automatic failover when CPU utilization reaches 100%

Because of some bad queries, our primary instance was overwhelmed: CPU utilization reached 100% and queries started timing out. We had to manually fail over to a replica instance.

We are wondering why automatic failover was not initiated in this case, or whether we need some additional configuration to enable it.

If automatic failover is not supported in this scenario, one solution I can see is triggering a Lambda function to force a failover when a CPU utilization alarm breaches its threshold. If someone has a simpler solution, it would be helpful.

Asked 1 year ago, 370 views
1 Answer
Accepted Answer

Hello.

"We are wondering why automatic failover was not initiated in this case, or whether we need some additional configuration to enable it."

According to the documentation below, a failover occurs when the primary instance fails.
In other words, even if CPU usage is high, a failover will not occur unless AWS judges the instance to have failed.
https://docs.aws.amazon.com/documentdb/latest/developerguide/failover.html#:~:text=When%20the%20primary%20instance%20fails,has%20its%20own%20endpoint%20address.

"When the primary instance fails, Amazon DocumentDB automatically fails over to an Amazon DocumentDB replica, if one exists."

"If automatic failover is not supported in this scenario, one solution I can see is triggering a Lambda function to force a failover when a CPU utilization alarm breaches its threshold. If someone has a simpler solution, it would be helpful."

Even if you perform a manual failover, unless you identify and fix the cause, CPU usage will climb in the same way on the new primary after the failover.
If you're okay with that, I think it's possible to automate the failover by using CloudWatch alarms, Amazon SNS, and Lambda.
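
As a rough sketch of the CloudWatch alarm, SNS, and Lambda approach, the handler below forces a failover when the SNS message reports a CloudWatch alarm in the ALARM state. It assumes the Lambda is subscribed to the alarm's SNS topic and that the cluster identifier is supplied through a hypothetical CLUSTER_ID environment variable (with an optional, also hypothetical, TARGET_INSTANCE_ID). The call used is boto3's `failover_db_cluster` on the `docdb` client; treat the rest as illustrative rather than a tested production setup.

```python
import json
import os


def alarm_is_firing(sns_event):
    """Return True if any SNS record carries a CloudWatch alarm in ALARM state."""
    for record in sns_event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        if message.get("NewStateValue") == "ALARM":
            return True
    return False


def force_failover(docdb_client, cluster_id, target_instance_id=None):
    """Ask DocumentDB to promote a replica; optionally name which one."""
    params = {"DBClusterIdentifier": cluster_id}
    if target_instance_id:
        params["TargetDBInstanceIdentifier"] = target_instance_id
    return docdb_client.failover_db_cluster(**params)


def lambda_handler(event, context, docdb_client=None):
    # Ignore OK / INSUFFICIENT_DATA notifications sent to the same topic.
    if not alarm_is_firing(event):
        return {"failover": False}
    if docdb_client is None:
        # Imported lazily so the parsing logic can be exercised without boto3.
        import boto3
        docdb_client = boto3.client("docdb")
    # CLUSTER_ID and TARGET_INSTANCE_ID are assumed names, not AWS defaults.
    force_failover(
        docdb_client,
        os.environ["CLUSTER_ID"],
        os.environ.get("TARGET_INSTANCE_ID"),
    )
    return {"failover": True}
```

The Lambda's execution role would need permission to fail over the cluster. Since the same queries will likely saturate the new primary as well, it may be worth limiting how often the alarm can trigger this function so the cluster does not fail over back and forth.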

Answered 1 year ago by an Expert
Reviewed 1 year ago by an Expert
