Recovering from broker failures in MSK

Suppose my cluster is set up as follows: Brokers: 3, AZs: 3, RF: 3, MinISR: 1, Acks: all

Q1: If a broker is being upgraded, Kafka will reassign the leadership of some partitions. After the upgrade will the leaderships get reassigned again so that all brokers are being used as before?

Q2: If 1 AZ (AZ1) goes down, I understand that Kafka will automatically reassign the partitions to the other brokers in the two AZs without impacting the producers and consumers. When AZ1 comes back will MSK automatically create/restart the failed broker and redistribute the partitions?

1 Answer

Accepted Answer

Please find answers inline:

Q1: If a broker is being upgraded, Kafka will reassign the leadership of some partitions. After the upgrade will the leaderships get reassigned again so that all brokers are being used as before?

  • Upgrades are performed in a rolling fashion, one broker at a time. For example, in a 3-broker cluster, while broker 1 is being upgraded, all the partition leadership it holds is reassigned to brokers 2 and 3. Once the upgrade is complete and all 3 brokers are active, the current leadership imbalance on each broker is checked against the broker config parameter 'leader.imbalance.per.broker.percentage' (default: 10%). If the threshold is exceeded, leadership is redistributed, so every broker holds leaders again after the upgrade.
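To make the threshold check concrete, here is a minimal Python sketch of how a per-broker imbalance ratio can be computed. It is an illustration, not Kafka's or MSK's actual code: the function name `brokers_needing_rebalance` and the sample data are hypothetical, and it assumes the first replica in each assignment is the preferred leader and that the ratio is the share of a broker's preferred partitions it does not currently lead.

```python
from collections import defaultdict


def brokers_needing_rebalance(assignments, leaders, threshold_pct=10):
    """Return {broker_id: imbalance_ratio} for brokers above the threshold.

    assignments: {partition: [replica broker ids, preferred leader first]}
    leaders:     {partition: broker id currently leading the partition}
    """
    preferred_total = defaultdict(int)  # partitions each broker should lead
    not_led = defaultdict(int)          # ...of those, how many it does not lead
    for partition, replicas in assignments.items():
        preferred = replicas[0]
        preferred_total[preferred] += 1
        if leaders[partition] != preferred:
            not_led[preferred] += 1
    return {
        broker: not_led[broker] / total
        for broker, total in preferred_total.items()
        if not_led[broker] / total > threshold_pct / 100
    }


# Hypothetical 6-partition topic on brokers 1, 2, 3.
assignments = {0: [1, 2, 3], 1: [2, 3, 1], 2: [3, 1, 2],
               3: [1, 3, 2], 4: [2, 1, 3], 5: [3, 2, 1]}

# Broker 1 has just come back from its upgrade and leads nothing yet.
leaders_after_upgrade = {0: 2, 1: 2, 2: 3, 3: 3, 4: 2, 5: 3}
print(brokers_needing_rebalance(assignments, leaders_after_upgrade))

# After leadership moves back to the preferred replicas, no broker is flagged.
leaders_balanced = {p: r[0] for p, r in assignments.items()}
print(brokers_needing_rebalance(assignments, leaders_balanced))
```

In this sample, broker 1 is flagged because it leads none of the partitions it is preferred for, which is the situation right after its rolling restart.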

Q2: If 1 AZ (AZ1) goes down, I understand that Kafka will automatically reassign the partitions to the other brokers in the two AZs without impacting the producers and consumers. When AZ1 comes back will MSK automatically create/restart the failed broker and redistribute the partitions?

  • That's correct. Once the AZ comes back, the failed brokers are relaunched and added to the existing cluster topology, and partition leadership is then redistributed automatically.
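As a rough illustration of why producers keep working during the outage with the settings from the question (RF: 3, MinISR: 1, Acks: all), the sketch below simulates losing one AZ. The broker-to-AZ mapping and the function `simulate_az_outage` are hypothetical. It assumes every replica was in sync before the failure, that the new leader is the first surviving replica in assignment order, and that a write with acks=all is accepted while the in-sync replica count is at least min.insync.replicas.

```python
# Hypothetical layout: one broker per availability zone.
BROKER_AZ = {1: "az1", 2: "az2", 3: "az3"}


def simulate_az_outage(replicas, failed_az, min_insync_replicas=1):
    """Return (leader, isr, writable) for one partition after an AZ fails.

    replicas: replica broker ids in assignment order (preferred leader first)
    """
    # Replicas in the failed AZ drop out of the in-sync replica set.
    isr = [b for b in replicas if BROKER_AZ[b] != failed_az]
    # Assumption: leadership goes to the first surviving in-sync replica.
    leader = isr[0] if isr else None
    # acks=all writes need at least min.insync.replicas in-sync replicas.
    writable = leader is not None and len(isr) >= min_insync_replicas
    return leader, isr, writable


# Partition whose preferred leader (broker 1) sits in the failed AZ.
print(simulate_az_outage([1, 2, 3], "az1"))

# The same outage with a stricter, hypothetical MinISR of 3 blocks writes.
print(simulate_az_outage([1, 2, 3], "az1", min_insync_replicas=3))
```

With MinISR: 1, two in-sync replicas remain after the outage, so leadership moves to a surviving broker and writes continue. When the failed broker returns and catches up, it rejoins the in-sync set and can take back leadership of its preferred partitions.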
AWS
Support Engineer
answered 2 years ago
AWS
Expert
reviewed 2 years ago
