Recovering from broker failures in MSK
My cluster is set up as follows: Brokers: 3, AZs: 3, RF: 3, MinISR: 1, acks: all
Q1: If a broker is being upgraded, Kafka will reassign the leadership of some partitions. After the upgrade, will leadership be reassigned again so that all brokers are used as before?
Q2: If one AZ (AZ1) goes down, I understand that Kafka will automatically reassign the partitions to the brokers in the other two AZs without impacting the producers and consumers. When AZ1 comes back, will MSK automatically create/restart the failed broker and redistribute the partitions?
Please find answers inline:
Q1: If a broker is being upgraded, Kafka will reassign the leadership of some partitions. After the upgrade, will leadership be reassigned again so that all brokers are used as before?
- Upgrades are done in a rolling fashion, one broker at a time. For example, in a 3-broker cluster, while broker 1 is being upgraded, all the partition leadership that broker 1 holds is reassigned to broker 2 and broker 3. When the upgrade is complete and all 3 brokers are active, the current partition leadership ratio between brokers is checked against the broker config parameter 'leader.imbalance.per.broker.percentage', which defaults to 10%, and leadership is redistributed accordingly. So yes, all brokers get partition leadership again after the upgrade.
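To make the imbalance check concrete, here is a minimal Python sketch of the idea, not MSK's or Kafka's actual code. It assumes the usual Kafka convention that the first replica in a partition's assignment is its preferred leader, and that a broker's imbalance is the fraction of its preferred partitions that it does not currently lead. The topic name, assignment, and function names are hypothetical.

```python
from collections import defaultdict


def leader_imbalance(assignment, leaders):
    """Return {broker: fraction of its preferred partitions it does not lead}.

    assignment: {partition: [replica broker ids]}, first id = preferred leader.
    leaders:    {partition: broker id that currently leads the partition}.
    """
    preferred = defaultdict(int)   # partitions for which the broker is preferred leader
    displaced = defaultdict(int)   # of those, how many are led by another broker
    for partition, replicas in assignment.items():
        pref = replicas[0]
        preferred[pref] += 1
        if leaders[partition] != pref:
            displaced[pref] += 1
    return {broker: displaced[broker] / preferred[broker] for broker in preferred}


def brokers_needing_election(assignment, leaders, threshold_percent=10):
    """Brokers whose imbalance exceeds the threshold (10 mirrors the default)."""
    ratios = leader_imbalance(assignment, leaders)
    return sorted(b for b, r in ratios.items() if r > threshold_percent / 100)


# Hypothetical 3-broker layout right after broker 1 finished its upgrade:
# broker 1 is back, but brokers 2 and 3 still lead its partitions.
assignment = {
    "orders-0": [1, 2, 3],
    "orders-1": [2, 3, 1],
    "orders-2": [3, 1, 2],
    "orders-3": [1, 3, 2],
}
leaders_after_upgrade = {"orders-0": 2, "orders-1": 2, "orders-2": 3, "orders-3": 3}

print(leader_imbalance(assignment, leaders_after_upgrade))
print(brokers_needing_election(assignment, leaders_after_upgrade))
```

In this example broker 1 leads none of the partitions it is preferred for, so it is well over the 10% threshold and leadership would be moved back to it.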
Q2: If one AZ (AZ1) goes down, I understand that Kafka will automatically reassign the partitions to the brokers in the other two AZs without impacting the producers and consumers. When AZ1 comes back, will MSK automatically create/restart the failed broker and redistribute the partitions?
- That's correct. Once the AZ comes back, the failed brokers will be relaunched and added to the existing cluster topology, and partition leadership will then be redistributed automatically.
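The failover and recovery described above can be illustrated with a small Python model. This is a simplified sketch under stated assumptions, not an MSK API: it assumes the leader is the first broker in a partition's replica list that is both alive and in the in-sync replica set (ISR), and that a relaunched broker has caught up and rejoined the ISR before leadership moves back. The partition names and the `elect_leader` helper are hypothetical.

```python
def elect_leader(replicas, isr, live):
    """Pick the first assigned replica that is alive and in sync, else None."""
    for broker in replicas:
        if broker in isr and broker in live:
            return broker
    return None


# Hypothetical layout: 3 brokers (one per AZ), RF 3, replica lists rotated
# so that each broker is the preferred leader of one partition.
assignment = {"p0": [1, 2, 3], "p1": [2, 3, 1], "p2": [3, 1, 2]}
all_brokers = {1, 2, 3}

# AZ1 goes down: broker 1 is offline and drops out of every ISR.
survivors = {2, 3}
during_outage = {
    p: elect_leader(replicas, isr=survivors, live=survivors)
    for p, replicas in assignment.items()
}

# AZ1 returns: broker 1 is relaunched, catches up, and rejoins the ISR,
# so electing the preferred leader restores the original distribution.
after_recovery = {
    p: elect_leader(replicas, isr=all_brokers, live=all_brokers)
    for p, replicas in assignment.items()
}

print(during_outage)
print(after_recovery)
```

With RF 3 and MinISR 1, each partition still has two in-sync replicas during the outage, which is why producers using acks=all can keep writing in this model.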