EKS Terraform Upgrade Issue v17.24.0 -> v20.34.0

As part of upgrading our infrastructure stack, we are upgrading the terraform-aws-eks module from v17.24.0 to v20.34.0. One of the things we are testing is the migration of existing clusters in our platform. The current problem is that the Terraform plan tries to replace the entire "aws_eks_cluster" resource because the "role_arn" attribute changes, which would destroy and recreate the whole EKS cluster. Our existing production clusters run workloads that cannot tolerate downtime, so this must be avoided.

Comparing how "role_arn" is assigned in each version shows that the source of the value differs:

v17.24.0
https://github.com/terraform-aws-modules/terraform-aws-eks/blob/e90c877a741ab3cc4215376a70f7bcc360b6a3d2/main.tf#L16C3-L16C11
https://github.com/terraform-aws-modules/terraform-aws-eks/blob/e90c877a741ab3cc4215376a70f7bcc360b6a3d2/locals.tf#L14

v20.34.0
https://github.com/terraform-aws-modules/terraform-aws-eks/blob/dd2089b73b4ff296e519830efdfda760e8d52b8a/main.tf#L39
https://github.com/terraform-aws-modules/terraform-aws-eks/blob/dd2089b73b4ff296e519830efdfda760e8d52b8a/main.tf#L23

However the Terraform plan does not provide sufficient information as to what the ARN is changing to, as the following message appears in the terraform plan log for the "aws_eks_cluster" object:

resource "aws_eks_cluster" "this" {
    ...
    role_arn = "arn:aws:iam::[ACCOUNT_ID]:role/[ROLE_NAME]" -> (known after apply) # forces replacement
    ...
}

Is there any insight as to what steps should be taken to preserve the role_arn as previously declared in version 17.24.0 of the module so that a destroy/recreate does not occur for the cluster being upgraded to the new version? Thanks.

asked a year ago, 410 views
1 Answer
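When upgrading the terraform-aws-eks module from v17.24.0 to v20.34.0, a plan that replaces the cluster because of role_arn is a common problem. The "(known after apply)" value is the clue: Terraform does not yet know the new ARN, which usually means it intends to create a new IAM role instead of reusing the existing one. The module was restructured heavily between these versions, including how the cluster IAM role is declared, so the role recorded in your state no longer lines up with the new configuration.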

The key to avoiding cluster recreation is to ensure the role ARN remains exactly the same. Here are some approaches to resolve this:

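  1. Use a moved block: In Terraform 1.1+, a moved block tells Terraform that an existing object now lives at a new resource address, so it is not destroyed and recreated. This only helps with renamed resources, which is relevant here if the address of the cluster IAM role inside the module changed between versions; running terraform state mv against the old and new addresses achieves the same result. Compare the addresses in your state (terraform state list) with those in the plan to confirm which ones moved.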

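  2. Import existing resources: If the plan shows a resource being created that already exists in AWS (for example, the IAM role), import it at its new address with terraform import, or with an import block in Terraform 1.5+, so that Terraform adopts it instead of creating a new one.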

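  3. Use ignore_changes: A lifecycle block with ignore_changes = [role_arn] on the aws_eks_cluster resource stops Terraform from acting on changes to that attribute. Lifecycle settings cannot be passed into a module from the calling configuration, so this requires a fork or local copy of the module, and it masks the mismatch rather than fixing it. Treat it as a last resort.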

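  4. Explicitly set the role ARN: Instead of letting the module create a role and derive the ARN, pass in the exact ARN that was used previously. In v20 of the module this is done with the iam_role_arn input together with create_iam_role = false; verify the input names against the module documentation for your version.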

  5. Staged migration: Consider a staged approach where you first modify your Terraform configuration to explicitly reference the existing role ARN before upgrading the module version.
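
As a rough illustration of options 4 and 5, the sketch below passes an existing role ARN into the v20 module rather than letting the module create a role. It assumes the create_iam_role and iam_role_arn inputs of terraform-aws-modules/eks v20; the cluster name, Kubernetes version, VPC ID, subnet IDs, and ARN are hypothetical placeholders, not values from the question, so check everything against the module documentation before relying on it.

```hcl
# Sketch only: all literal values below are hypothetical placeholders.
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "20.34.0"

  cluster_name    = "example-cluster" # must match the existing cluster name
  cluster_version = "1.30"            # assumption: set to the version already running

  vpc_id     = "vpc-0123456789abcdef0"
  subnet_ids = ["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"]

  # Do not let the module create a new cluster IAM role. Reuse the role
  # the cluster already has, so role_arn is a known, unchanged value.
  create_iam_role = false
  iam_role_arn    = "arn:aws:iam::111122223333:role/example-cluster-role"
}
```

If the v17 module created the role itself, the old role resource is still recorded in state, so the plan may now propose destroying it. Review the plan for that action and deal with the state entry (for example with terraform state mv or terraform state rm) before applying. Once the ARN matches, the plan for aws_eks_cluster should no longer show role_arn as forcing replacement, though other attributes changed between versions may still need attention.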

Before proceeding with the upgrade in production, it's highly recommended to test this approach in a staging environment that mirrors your production setup as closely as possible. This will help identify any other potential issues with the upgrade process.

If you're using Terraform with a CI/CD pipeline, you might want to temporarily disable automatic applies and carefully review the plan output before proceeding with the actual apply.

answered a year ago
EXPERT
reviewed a year ago
