I'm running into an unusual issue where, on one specific new AWS account, I cannot create any EKS nodegroups at all. On two other AWS accounts, I can create nodegroups without any problems.
The nodegroup creation always fails with something like the following:
NodeCreationFailure
Instances failed to join the kubernetes cluster
DUMMY_2f2298a2-f492-439a-b7bb-ff931c539d78
DUMMY_5651ecbb-690e-4f3e-bc28-c52dc0d95bca
DUMMY_6db1e73c-a1c7-4258-b10d-f6994864c3ef
DUMMY_93f8d481-afd5-4811-ae28-aa2c50bd3ef5
DUMMY_950c3c89-d7ef-489d-8023-bc88a3b8a99c
DUMMY_a5e09b94-4c86-4d0b-bb12-b9630ee544de
DUMMY_bab43e87-11f8-4747-908a-06ae3741c612
DUMMY_c3f7c48a-4138-48d4-ba15-894a33f2d90a
DUMMY_cccca0c7-98ae-4bf7-8441-8124971e8a78
DUMMY_d9909a43-ebf5-4340-99f0-47281499b2e2
DUMMY_daa1703a-8032-4fa5-9eae-c8a0b04fc1dd
DUMMY_f3d0c7e8-b265-4927-98d4-33f7d4cd5ace
This happens whether I use eksctl to create a new cluster from scratch with nodegroups (both when I specify the nodegroup configuration and when I let it use the defaults for the initial nodegroup), use eksctl to add a nodegroup to an existing cluster, or create a nodegroup on an existing cluster through the AWS web console; the commands are roughly the ones sketched below. I've tried all of these approaches on the other accounts and they succeed every time. I've also tried both us-west-1 and us-west-2: no success on the affected account, and nothing but success on the other accounts.
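For reference, the eksctl invocations I've been using look roughly like this (the cluster, nodegroup, and instance type names here are just placeholders, not the exact values I used):

```bash
# Create a new cluster and let eksctl provision a default managed nodegroup
# (names and region are placeholders)
eksctl create cluster --name test-cluster --region us-west-2

# Also tried adding a nodegroup with an explicit configuration to an existing cluster
eksctl create nodegroup \
  --cluster test-cluster \
  --region us-west-2 \
  --name test-ng \
  --node-type t3.medium \
  --nodes 2
```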
I looked up common sources of this issue (https://docs.aws.amazon.com/eks/latest/userguide/troubleshooting.html) and tried the suggested fixes without success. The IAM roles that get created with each nodegroup (before they're deleted when the creation fails) look identical to the ones on the working accounts, with the managed policies AmazonEKSWorkerNodePolicy, AmazonEC2ContainerRegistryReadOnly, and AmazonEKS_CNI_Policy attached. I even created an IAM role myself with those three policies attached and used it to create a nodegroup through the web console, and it still failed.
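This is roughly how I compared the node role between accounts (the role name is a placeholder for the actual role in each account):

```bash
# List the managed policies attached to the nodegroup's IAM role
# (role name is a placeholder)
aws iam list-attached-role-policies --role-name eks-nodegroup-role

# On both the working and failing accounts this shows the same three policies:
#   AmazonEKSWorkerNodePolicy
#   AmazonEC2ContainerRegistryReadOnly
#   AmazonEKS_CNI_Policy
```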
The VPCs these clusters use are configured for IPv4, not IPv6. The VPCs' main security groups allow all outbound traffic, and since they were set up by eksctl, each VPC has two public and two private subnets, with the public subnets configured to auto-assign public IP addresses, so instances in them should have internet access. The managed nodegroups created when I spin up a new cluster with eksctl appear to use only the public subnets, so they should definitely have public access.
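For what it's worth, this is roughly how I checked the subnet setup (the subnet IDs below are placeholders for the ones eksctl created):

```bash
# Confirm the public subnets auto-assign public IPs
# (subnet IDs are placeholders)
aws ec2 describe-subnets \
  --subnet-ids subnet-aaaa1111 subnet-bbbb2222 \
  --query 'Subnets[].{Id:SubnetId,PublicIP:MapPublicIpOnLaunch}'

# Confirm the route table associated with a public subnet has a route to an internet gateway
aws ec2 describe-route-tables \
  --filters Name=association.subnet-id,Values=subnet-aaaa1111 \
  --query 'RouteTables[].Routes[]'
```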
The IAM identity I'm using has the AdministratorAccess policy on this account.
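I verified that roughly like this (the user name is a placeholder, and I'm calling the APIs as an IAM user rather than a role):

```bash
# Confirm which identity I'm actually making API calls as
aws sts get-caller-identity

# Confirm AdministratorAccess is attached to that user
# (user name is a placeholder)
aws iam list-attached-user-policies --user-name my-admin-user
```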
I'm running out of ideas for how to solve this. It really does seem to be tied to this account, but I can't figure out what's causing such a specific problem.