AWS EKS instance not found when attaching EBS volume

A few days ago, attaching EBS volumes suddenly stopped working. My EKS cluster uses the ebs.csi.aws.com add-on with dynamic provisioning.

Here is my StorageClass config:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

And here are the volumeClaimTemplates in my StatefulSet config:

  volumeClaimTemplates:
  - metadata:
      name: log
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

After the StatefulSet is deployed, a PVC, a PV and a VolumeAttachment are created. However, the pod is stuck in the ContainerCreating state with this error:

AttachVolume.Attach failed for volume "pvc-xxx" : rpc error: code = NotFound desc = Instance "i-xxx" not found

I triple-checked, the volume is not attached to any other instance, and the instance exists.

One odd thing, though: when I describe the created PV, I see this:

Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            ebs.csi.aws.com
    FSType:            ext4
    VolumeHandle:      vol-xxx
    ReadOnly:          false
    VolumeAttributes:      storage.kubernetes.io/csiProvisionerIdentity=xxx-8081-ebs.csi.aws.com

The volume that the (unmasked) VolumeHandle refers to does not even exist.

Where might the problem be? As I said earlier, this issue appeared from one day to the next, without any change to the config.

Kubernetes version: 1.24. EBS CSI driver add-on version: v1.11.5-eksbuild.2 (neither upgrading nor downgrading helped).

Thanks

1 Answer

When you use EBS for persistent volumes, remember that an EBS volume is located in a single Availability Zone (AZ), so only EC2 instances in the same AZ can attach it. If a pod is rescheduled to a node in another AZ, it may fail with an error saying that it cannot find or attach the persistent volume. Each node can carry a label with its AZ, so you can use nodeSelector or affinity to have the pod scheduled only in a particular AZ.
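
As a rough sketch of the affinity approach, the fragment below pins the StatefulSet's pods to nodes in one zone. It assumes the nodes carry the well-known topology.kubernetes.io/zone label, and the zone value eu-west-1a is only a placeholder; neither comes from the question. Treat it as an illustration, not a confirmed fix for this error.

```yaml
# Fragment of a StatefulSet manifest (other fields omitted).
# Assumption: nodes are labeled with topology.kubernetes.io/zone.
# "eu-west-1a" is a placeholder; use the AZ where the EBS volume lives.
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - eu-west-1a
```

Running kubectl get nodes -L topology.kubernetes.io/zone shows which zone each node reports, which helps when choosing the value.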

answered a year ago
Artem
reviewed 3 days ago
  • I believe the volume binding mode WaitForFirstConsumer should prevent AZ mismatch.
