Which EKS cluster upgrade method should I use?

Which EKS cluster upgrade method should I use: in-place or blue/green?

asked a month ago, 104 views
2 Answers

This document compares two methods for upgrading Amazon EKS clusters: In-Place Upgrade and Blue/Green Deployment.

In-Place Upgrade

Advantages:

  1. Keeps the same API server endpoint and OIDC provider
  2. Simpler process for sequential updates (one minor version at a time)
  3. No DNS or load balancer changes required
  4. Lower complexity in resource management, and lower cost
  5. Best suited to clusters that are only one or two minor versions behind

Disadvantages:

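  1. Limited to upgrading one minor version at a time
  2. Requires upgrading the control plane first, then the data plane
  3. Higher risk during the upgrade, because changes are made to the live environment and the control plane cannot be downgraded afterwards
  4. Requires careful configuration of PodDisruptionBudgets and topologySpreadConstraints so workloads stay available while nodes are drained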
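
To make the "one minor version at a time" constraint concrete, here is a minimal Python sketch that lists the intermediate versions an in-place upgrade would have to pass through. The function name `upgrade_path` is hypothetical and it only does the arithmetic; it does not call any AWS API, and it assumes versions are written as `major.minor`.

```python
def upgrade_path(current: str, target: str) -> list[str]:
    """Return every minor version an in-place upgrade must step through."""
    cur_major, cur_minor = (int(part) for part in current.split("."))
    tgt_major, tgt_minor = (int(part) for part in target.split("."))
    if cur_major != tgt_major:
        raise ValueError("this sketch only handles a single major version")
    if tgt_minor < cur_minor:
        raise ValueError("target is older than current; downgrades are not possible")
    # One entry per control plane upgrade, each followed by a data plane upgrade.
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]


if __name__ == "__main__":
    # A cluster on 1.25 that needs to reach 1.30 needs five separate upgrades.
    print(upgrade_path("1.25", "1.30"))
```

Each entry in the returned list represents a full upgrade cycle (control plane, then nodes and add-ons), which is why a long list is a signal to consider blue/green instead.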

Blue/Green Deployment

Advantages:

  1. Allows multiple version jumps at once (e.g., 1.25 to 1.30)
  2. Easy rollback (switch back to old cluster)
  3. Enables gradual workload migration
  4. Current production environment remains intact during migration
  5. Best option for significantly outdated clusters
  6. Allows complete testing before final migration

Disadvantages:

  1. The API endpoint and OIDC provider change, so kubeconfigs and IAM trust policies that reference them must be updated
  2. Duplicate costs during migration
  3. Need to verify per-Region limits (for example, the number of clusters)
  4. Requires coupled workloads to be migrated together
  5. Complex migration for stateful applications (backup/restore needed)
  6. External DNS and load balancer changes required
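
The gradual migration and easy rollback described above are often done by shifting weighted traffic between the two clusters in steps. The sketch below only computes a weight schedule; `weight_schedule` is a hypothetical helper, and applying the weights (for example to weighted DNS records or a load balancer) is left out because it depends on your setup.

```python
def weight_schedule(steps: int) -> list[tuple[int, int]]:
    """Return (blue_weight, green_weight) pairs, out of 100, from all-blue to all-green."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    schedule = []
    for i in range(steps + 1):
        green = round(100 * i / steps)  # share of traffic sent to the new cluster
        schedule.append((100 - green, green))
    return schedule


if __name__ == "__main__":
    for blue, green in weight_schedule(4):
        print(f"blue={blue:3d}  green={green:3d}")
```

Rolling back is the same schedule in reverse: return to an earlier pair, or straight to the first one, if the new cluster misbehaves.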

Important Recommendations for Both Methods

Validation Tools:

Pre-Upgrade Checks:

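  • Check add-on compatibility with the target version
  • Check for deprecated or removed APIs
  • Take a backup (for example, with Velero)
  • Validate manifests against the target version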
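
As an illustration of the deprecated-API check, here is a small Python sketch that flags manifests using API versions removed at or before a target Kubernetes version. The `REMOVED_IN` table is a deliberately short, non-exhaustive sample, and `find_removed_apis` is a hypothetical helper that expects manifests already parsed into dictionaries; in practice a dedicated scanning tool would be more complete.

```python
# (apiVersion, kind) -> Kubernetes version in which that API was removed.
# Illustrative subset only; consult the Kubernetes deprecation guide for the full list.
REMOVED_IN = {
    ("extensions/v1beta1", "Ingress"): (1, 22),
    ("policy/v1beta1", "PodDisruptionBudget"): (1, 25),
    ("batch/v1beta1", "CronJob"): (1, 25),
    ("autoscaling/v2beta2", "HorizontalPodAutoscaler"): (1, 26),
}


def parse_version(version: str) -> tuple[int, int]:
    major, minor = version.split(".")
    return int(major), int(minor)


def find_removed_apis(manifests: list[dict], target_version: str) -> list[str]:
    """Return one message per manifest whose API no longer exists in target_version."""
    target = parse_version(target_version)
    findings = []
    for manifest in manifests:
        key = (manifest.get("apiVersion"), manifest.get("kind"))
        removed = REMOVED_IN.get(key)
        if removed is not None and target >= removed:
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            findings.append(
                f"{key[1]}/{name} uses {key[0]}, removed in {removed[0]}.{removed[1]}"
            )
    return findings


if __name__ == "__main__":
    sample = [
        {"apiVersion": "policy/v1beta1", "kind": "PodDisruptionBudget",
         "metadata": {"name": "web-pdb"}},
        {"apiVersion": "apps/v1", "kind": "Deployment", "metadata": {"name": "web"}},
    ]
    for finding in find_removed_apis(sample, "1.25"):
        print(finding)
```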

Documentation:

  • Create an upgrade runbook
  • Document all steps
  • Prepare a rollback plan
  • Plan regular updates (at least once per year)

Decision Factors

The choice between in-place and blue/green should consider:

  • Current cluster state
  • Version differences
  • Availability requirements
  • Available budget
  • Environment complexity
  • Workload coupling
  • Team capacity
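
As a rough starting point, the version-gap guidance in this answer can be written down as a simple rule. This is a simplified sketch, not an official decision procedure: `suggest_method` is a hypothetical function, the threshold of two minor versions comes from the "1-2 versions behind" point above, and the other factors in the list (budget, availability requirements, team capacity) still need human judgment.

```python
def suggest_method(minor_versions_behind: int, has_stateful_workloads: bool = False) -> str:
    """Suggest an upgrade method from the version gap alone (a heuristic, not a rule)."""
    if minor_versions_behind < 0:
        raise ValueError("minor_versions_behind cannot be negative")
    if minor_versions_behind <= 2:
        # Small gap: sequential in-place upgrades stay manageable.
        return "in-place"
    if has_stateful_workloads:
        # Large gap, but stateful data makes the cluster switch harder.
        return "blue/green (plan backup/restore for stateful workloads)"
    return "blue/green"


if __name__ == "__main__":
    print(suggest_method(1))
    print(suggest_method(5))
    print(suggest_method(5, has_stateful_workloads=True))
```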

For more detailed information, see the Amazon EKS Best Practices Guide for Cluster Upgrades and "How do I plan an upgrade strategy for an Amazon EKS cluster?"

answered a month ago by AWS
reviewed a month ago by AWS (EXPERT)

Both in-place and blue/green upgrade methods have their merits for EKS cluster upgrades, and the choice depends on your specific requirements and constraints.

In-place upgrades are generally simpler and involve updating the existing cluster components. This method is suitable when you're upgrading to the next minor version and want to minimize resource usage. It's typically less complex and can be done using managed node groups or EKS on Fargate for simplified data plane upgrades.

Blue/green upgrades, on the other hand, involve creating an entirely new cluster with the desired version and gradually migrating workloads. This method offers several benefits:

  1. Ability to upgrade multiple EKS versions at once (e.g., from 1.23 to 1.25)
  2. Option to switch back to the old cluster if issues arise
  3. Opportunity to manage the new cluster with updated systems (e.g., Terraform)
  4. Flexibility to migrate workloads individually

However, blue/green upgrades also have some downsides:

  1. More complex to implement, especially with stateful workloads
  2. Requires managing traffic flow outside of Kubernetes (using CDN or DNS)
  3. Uses twice the resources (IPs, control planes, storage, workers, load balancers)
  4. Can be more expensive
  5. May cause unnecessary bin packing when migrating subgroups of nodes

When deciding between the two methods, consider factors such as your upgrade goals, resource constraints, workload types, and risk tolerance. If you need to make significant changes or want more control over the migration process, a blue/green upgrade might be preferable. If you're looking for a simpler, less resource-intensive approach and are comfortable with the limitations, an in-place upgrade could be the better choice.

Ultimately, the best method depends on your specific use case and organizational requirements. It's important to carefully evaluate both options and choose the one that aligns best with your needs and constraints.
Sources
Best Practices for Cluster Upgrades - Amazon EKS
Community | Amazon EKS Upgrade Guidance (v1.25 to 1.26)

answered a month ago