EKS pods see inconsistent data when using EFS PV
Hello!
Short summary of context and issue
I am using EFS to mount a PV (ReadWriteMany access mode) via a PVC into EKS pods. The issue I'm having is that writes propagate across pods with large delays: one pod may successfully write a file to the shared directory, but other pods see it only some 10-60 seconds later (the delay varies across experiments, seemingly at random).
Experiment & Concrete results
I run two simple pods. Pod1 runs first and continuously checks if /workdir/share_point/example.txt exists via the stat command. Pod2 runs second and writes the file, then does the same checks. As can be seen from the logs below, the file created at 16:52:36.544 is visible in Pod1 only at ~16:52:57.694.
Logs of Pod1: pod1.log
Logs of Pod2: pod2.log
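For reference, the gap between the two timestamps above works out to roughly 21 seconds. A minimal sketch of that arithmetic, assuming timestamps in the HH:MM:SS.mmm format and both taken on the same day; the helper names to_ms and delay_ms are hypothetical and not part of the original setup:

```shell
# Hypothetical helper: convert HH:MM:SS.mmm to milliseconds since midnight.
to_ms() {
  echo "$1" | awk -F'[:.]' '{ print (($1 * 60 + $2) * 60 + $3) * 1000 + $4 }'
}

# Hypothetical helper: delay in milliseconds between two same-day timestamps.
delay_ms() {
  echo $(( $(to_ms "$2") - $(to_ms "$1") ))
}

delay_ms 16:52:36.544 16:52:57.694   # prints 21150
```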
Expected results
I expected that Pod1 sees the file as soon as it is successfully written, as is the case for Pod2. As far as I understand, this would fit the consistency model described in the docs.
Worth Mentioning
If I manually kubectl exec into the pods and attempt something similar, the problem seems to not be there, see manual_test.log
- Pod2:
echo "Manual test" > /workdir/share_point/manual_test.txt
- Pod1:
date +\"%T.%3N\" && stat /workdir/share_point/manual_test.txt
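To make the manual check closer to what Pod1 does, the stat call could be repeated in a loop until the file shows up. This is only a sketch of such a polling loop, not the actual script from pod1.yaml; the function name wait_for_file, the polling interval and the retry limit are assumptions, and it relies on GNU date and sleep (fractional seconds, %3N):

```shell
# Hypothetical helper: poll for a file with stat and print a timestamp
# once it becomes visible (or once the retry limit is reached).
wait_for_file() {
  path=$1
  interval=${2:-0.1}    # seconds between checks (assumed value)
  max_tries=${3:-600}   # give up after this many checks (assumed value)
  i=0
  while [ "$i" -lt "$max_tries" ]; do
    if stat "$path" >/dev/null 2>&1; then
      echo "$(date +%T.%3N) visible: $path"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "$(date +%T.%3N) not visible after $max_tries checks: $path"
  return 1
}
```

Running `wait_for_file /workdir/share_point/manual_test.txt` in Pod1 before issuing the echo in Pod2 would give a timestamp that can be compared with the time of the write.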
Steps to Reproduce
In what follows, I provide the simplest setup that reproduces the issue that I have. Following AWS docs, I set up a VPC, an EKS cluster and an EFS as a storage provider for the cluster. Each section below refers to the documentation I've followed and provides the commands used.
VPC
Follows creating-a-vpc. Creates a VPC from a template; it will have 2 private and 2 public subnets, suitably configured to host an EKS cluster.
aws cloudformation create-stack --stack-name public-private-subnets \
  --template-url https://s3.us-west-2.amazonaws.com/amazon-eks/cloudformation/2020-10-29/amazon-eks-vpc-private-subnets.yaml
EKS cluster
Follows create-cluster. I specify the cluster name, region, and for simplicity manually copy the subnet IDs of the above VPC.
eksctl create cluster --name my-demo-cluster --region eu-central-1 \
  --with-oidc --version 1.24 --node-ami-family Ubuntu2004 \
  --vpc-private-subnets private_subnet1_id,private_subnet2_id \
  --vpc-public-subnets public_subnet1_id,public_subnet2_id \
  --node-private-networking --managed
EFS setup
Follows the efs-csi-page
Create a Policy
curl -O https://raw.githubusercontent.com/kubernetes-sigs/aws-efs-csi-driver/master/docs/iam-policy-example.json
aws iam create-policy \
  --policy-name AmazonEKS_EFS_CSI_Driver_Policy \
  --policy-document file://iam-policy-example.json
Create a ServiceAccount
Replace account-id accordingly in the command below.
eksctl create iamserviceaccount \
  --cluster my-demo-cluster \
  --namespace kube-system \
  --name efs-csi-controller-sa \
  --attach-policy-arn arn:aws:iam::account-id:policy/AmazonEKS_EFS_CSI_Driver_Policy \
  --approve \
  --region eu-central-1
Install the EFS CSI Driver
helm repo add aws-efs-csi-driver https://kubernetes-sigs.github.io/aws-efs-csi-driver/
helm repo update
helm upgrade -i aws-efs-csi-driver aws-efs-csi-driver/aws-efs-csi-driver \
  --namespace kube-system \
  --set image.repository=602401143452.dkr.ecr.eu-central-1.amazonaws.com/eks/aws-efs-csi-driver \
  --set controller.serviceAccount.create=false \
  --set controller.serviceAccount.name=efs-csi-controller-sa
Creating the EFS, SG and mount points
For simplicity, manually copy the subnet IDs of the VPC.
./complete_efs_setup.sh private_subnet1_id private_subnet2_id
Kubernetes StorageClass and PVC
Replace the Filesystem ID in kubernetes_storage/efs-storageclass.yaml:
kubectl apply -f kubernetes_storage/efs-storageclass.yaml
kubectl apply -f kubernetes_storage/efs-pvc.yaml
Deploy pods
kubectl apply -f debugging_pods/pod1.yaml
After the first one is running:
kubectl apply -f debugging_pods/pod2.yaml
Exec in pods
kubectl exec --stdin --tty pod1 -- /bin/bash
kubectl exec --stdin --tty pod2 -- /bin/bash
Relevant system information
Output of aws --version
aws-cli/2.8.7 Python/3.9.11 Linux/5.15.0-58-generic exe/x86_64.ubuntu.22 prompt/off
Output of eksctl version
0.125.0
Output of helm version
version.BuildInfo{Version:"v3.10.3", GitCommit:"835b7334cfe2e5e27870ab3ed4135f136eecc704", GitTreeState:"clean", GoVersion:"go1.18.9"}
Thank you
I'd be very thankful for any hint/pointer as to where the issue may lie. Thank you in advance.